BASE-MODIFIED NUCLEOTIDES AND THEIR USE FOR POLYMORPHISM DETECTION

FIELD OF THE INVENTION

[0001] The present invention relates generally to organic chemistry, analytical 
chemistry, biochemistry, molecular biology, genetics, diagnostics and medicine. In 
particular, it relates to base-modified nucleotides and methods for their use for the 
detection of single nucleotide polymorphisms (SNPs).

RELATED APPLICATIONS

[0002] This application is a continuation-in-part of U.S. Ser. No. 09/394,467 to
Stanton, Wolfe, and Verdine, filed September 10, 1999, entitled "A METHOD FOR
ANALYZING POLYNUCLEOTIDES." Ser. No. 09/394,467 in turn claims the benefit of
U.S. Provisional Patent Application, serial number 60/102,724, filed October 1, 1998,
also entitled "A METHOD FOR ANALYZING POLYNUCLEOTIDES." Both are
incorporated by reference in their entireties, including drawings and tables, as if fully set
forth herein.

BACKGROUND OF THE INVENTION

[0003] The following is offered as background information only and is not intended
nor admitted to be prior art to the present invention.

[0004] The ability to detect DNA sequence variances in an organism's genome has
become an important tool in the diagnosis of diseases and disorders and in the prediction of
response to potential therapeutic regimes. It is becoming increasingly possible, using early
variance detection, to diagnose and treat, even prevent, a disorder before it has physically
manifested itself. Furthermore, variance detection can be a valuable research tool in that it
may lead to the discovery of genetic bases for disorders the causes of which were hitherto
unknown or thought to be other than genetic.

[0005] It is estimated that sequence variations in human DNA occur with a frequency
of about 1 in 100 nucleotides when 50 to 100 individuals are compared. Nickerson, D.A.,
Nature Genetics, 1998, 223-240. This translates to as many as 30 million variances in the
human genome. However, very few of these variances have any effect on the physical
well-being of humans. Detecting these 30 million variances and then determining which of
them are relevant to human health is clearly a formidable task.

[0006] Once the DNA sequence of a DNA segment, e.g., a gene, a cDNA or, on a
larger scale, a chromosome or an entire genome, has been determined, the existence of
sequence variances in that DNA segment among members of the same species can be
explored. Complete DNA sequencing is the definitive procedure for accomplishing this task.
However, current DNA sequencing technology is costly, time consuming and, in order to
assure accuracy, highly redundant. Most sequencing projects require a 5- to 10-fold
coverage of each nucleotide to reach an acceptable error rate of 1 in 2,000 to 1 in 10,000
bases. In addition, DNA sequencing is an inefficient way to detect variances. A variance
between two copies of a gene, for example when two chromosomes are being compared,
may occur as infrequently as one in 1,000 or more bases. Thus, only a small segment of
the gene is of interest. If full sequencing is employed, a tremendous number of nucleotides
have to be sequenced to arrive at the desired information contained in that segment. For
example, to compare ten versions of a 3,000 nucleotide DNA sequence for the purpose of
detecting four variances among them, even if only 2-fold redundancy is employed (each
strand of the double-stranded 3,000 nucleotide DNA segment from each individual is
sequenced once), 60,000 nucleotides would have to be sequenced (10 x 3,000 x 2). In
addition, sequencing problems are often encountered that can require additional runs with
new primers. Thus, as many as 100,000 nucleotides might have to be sequenced to
determine four variances.
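
By way of illustration only, the sequencing workload in the example above can be reproduced with a short calculation. The function and parameter names below are hypothetical and are not part of the disclosure; the sketch simply restates the arithmetic (10 x 3,000 x 2).

```python
def bases_to_sequence(n_samples: int, segment_length: int, redundancy: int) -> int:
    """Total nucleotides read when every sample is sequenced in full."""
    return n_samples * segment_length * redundancy


# Ten versions of a 3,000 nucleotide segment, each strand sequenced once (2-fold).
total = bases_to_sequence(n_samples=10, segment_length=3_000, redundancy=2)
print(total)       # 60000
print(total // 4)  # 15000 nucleotides read per variance when four variances are found
```

The ratio printed last shows why full sequencing is an inefficient route to a handful of variances.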

[0007] What is needed is a rapid, inexpensive, yet accurate method to identify
variances such as SNPs among related polynucleotides. The present invention provides
such a method and materials for its implementation.

SUMMARY OF THE INVENTION 
[0008] Thus, in one aspect, the present invention comprises a method for
detecting a polymorphism in a polynucleotide. The method comprises, first, providing a
target polynucleotide. A natural nucleotide is replaced at greater than 90% of its points
of occurrence in the target polynucleotide, provided the points of occurrence are not in
a primer sequence, with a base-modified nucleotide to give a modified target
polynucleotide. The modified target polynucleotide is contacted with a reagent or
combination of reagents that cleaves it at greater than 90% of the base-modified
nucleotides to give a set of fragments. The set of fragments is then analyzed to detect
a polymorphism. In this method the base-modified nucleotide is selected from the
group consisting of:

[Chemical structures of the base-modified nucleotides are not reproduced in this text.]

R is a ribose or a deoxyribose moiety of a polynucleotide. R1 and R2 are independently
selected from the group consisting of hydrogen, alkyl, cycloalkyl, alkenyl, alkynyl, aryl,
aralkyl and alkaryl, wherein, if R1 or R2 contains two or more contiguous methylene
(-CH2-) groups, any two such methylene groups may have interjected between them a
group selected from the group consisting of -O-, -C(O)NH-, -C(O)NHC(O)-, -NH-,
-C(S)NH-, -CO-, -CS-, -S- and (-CF2-)m. Furthermore, m is 1-10. R3 is hydrogen or
-NH2. And, finally, n is 0, 1 or 2.

[0009] In a further aspect of this invention, the base-modified nucleotide is
selected from the group consisting of:

[Chemical structures are not reproduced in this text.]

R is a ribose or deoxyribose moiety of an oligonucleotide.

[0010] In an aspect of this invention, contacting the base-modified polynucleotide
with a reagent or reagents comprises contacting it with a chemical base.

[0011] In an aspect of this invention, the chemical base comprises a secondary
amine having a boiling point above 100° C at atmospheric pressure.

[0012] In an aspect of this invention, the secondary amine has a boiling point
above 150° C at atmospheric pressure. Most preferably, the secondary amine has a
boiling point greater than 200° C at atmospheric pressure.

[0013] In an aspect of this invention, the secondary amine is selected from the
group consisting of 3-pyrrolidinol, 2-pyrrolidinemethanol, 3-pyrrolidinemethanol,
4-hydroxypiperidine and 4-piperidineethanol.

[0014] In an aspect of this invention, analysis of the fragments comprises gel
electrophoresis.

[0015] In an aspect of this invention, analysis of the fragments comprises mass
spectrometry.

[0016] In an aspect of this invention, mass spectrometry comprises MALDI mass
spectrometry.

[0017] In an aspect of this invention, mass spectrometry comprises ESI mass
spectrometry.

[0018] In an aspect of this invention, analyzing the fragments comprises
comparing the masses of the fragments with masses of fragments predicted if the
polymorphism is present or with masses of fragments predicted if the polymorphism is
not present.
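
The comparison of observed and predicted fragment masses may be sketched, purely as an illustration, as follows. Everything in the sketch is an assumption made for the example: the two short sequences are hypothetical, the residue masses are standard approximate average values for nucleotides within a DNA chain, cleavage is modeled as removal of every A (standing in for a base-modified dA analog), and no attempt is made to account for end groups or for the mass changes introduced by the actual modification and cleavage chemistry.

```python
# Approximate average masses (Da) of DNA nucleotide residues within a chain.
# Assumption for illustration: end groups and chemistry-specific shifts are ignored.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}


def cleave(seq, base="A"):
    """Split seq at every occurrence of base, discarding the cleaved base itself."""
    return [frag for frag in seq.split(base) if frag]


def fragment_masses(seq, base="A"):
    """Predicted (simplified) masses of the fragments left after cleavage."""
    return sorted(round(sum(RESIDUE_MASS[b] for b in frag), 2)
                  for frag in cleave(seq, base))


def matches(observed, predicted, tol=0.5):
    """True if every predicted mass has an observed peak within tol Da."""
    return all(any(abs(o - p) <= tol for o in observed) for p in predicted)


reference = "GCTAGGCTATCCGA"  # hypothetical allele without the polymorphism
variant = "GCTAGGCTGTCCGA"    # hypothetical allele with A replaced by G at one locus

pred_ref = fragment_masses(reference)
pred_var = fragment_masses(variant)

observed = pred_var  # stand-in for measured peaks from a variant sample
print(matches(observed, pred_var), matches(observed, pred_ref))  # True False
```

In this toy case the loss of one cleavage site merges two fragments into one longer fragment, so the observed peaks agree with the prediction for the variant allele and not with the prediction for the reference allele.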

[0019] An aspect of this invention relates to a method for detecting a
polymorphism in a polynucleotide, comprising replacing a natural nucleotide at greater
than 90% of its points of occurrence in the target polynucleotide, provided the points of
occurrence are not in a primer sequence, with a modified nucleotide to give a modified
target polynucleotide. The modified target polynucleotide is contacted with a secondary
amine having a boiling point greater than 100° C at atmospheric pressure, at a
temperature that results in cleavage of the modified polynucleotide at greater than 90%
of the modified nucleotides, to give a set of fragments. The fragments are then
analyzed to detect a polymorphism.

P 

[0020] In an aspect of this invention, the above secondary amine has a boiling
point greater than 150° C at atmospheric pressure.

[0021] In an aspect of this invention, the secondary amine has a boiling point
greater than 200° C at atmospheric pressure.

[0022] In an aspect of this invention, the secondary amine is selected from the
group consisting of 3-pyrrolidinol, 2-pyrrolidinemethanol, 3-pyrrolidinemethanol,
4-hydroxypiperidine and 4-piperidineethanol.

[0023] In an aspect of this invention, the polynucleotide is contacted with a
chemical oxidant prior to contact with the secondary amine. The chemical oxidant
comprises potassium permanganate in a presently preferred aspect of this invention.

[0024] In any of the above methods, the percentage replacement of a natural
nucleotide with a modified nucleotide, the percentage cleavage of a modified
polynucleotide or both the percentage replacement and the percentage cleavage is
greater than 95% in an aspect of this invention.

[0025] In any of the above methods, the percentage replacement of a natural 
nucleotide with a modified nucleotide, the percentage cleavage of a modified 
polynucleotide or both the percentage replacement and the percentage cleavage is 
greater than 99% in an aspect of this invention. 

DETAILED DESCRIPTION OF THE INVENTION
BRIEF DESCRIPTION OF THE TABLES 
[0026] Table 1 shows the molecular weights of the four DNA nucleotide
monophosphates and the mass difference between each pair of nucleotides.

[0027] Table 2 shows the masses of all possible 2mers, 3mers, 4mers and
5mers of the DNA nucleotides in Table 1.

[0028] Table 3 shows the masses of all possible 2mers, 3mers, 4mers, 5mers,
6mers and 7mers that would be produced by cleavage at one of the four nucleotides
and the mass differences between neighboring oligonucleotides.

[0029] Table 4 shows the 8 sets of isobaric (masses within 0.01% of each other)
oligonucleotides that are found among all oligonucleotides up to 30mers.

[0030] Table 5 shows the mass changes that will occur for all possible point
mutations (replacement of one nucleotide by another) and the theoretical maximum size
of a polynucleotide in which a point mutation should be detectable by mass
spectrometry using mass spectrometers of varying resolving powers.

[0031] Table 6 shows the expected molecular weights for the commercial RFC
primer, RFC mut primer and RFC mut primer with a G deletion.
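
The kind of calculation summarized in Tables 1 and 5 can be sketched as follows. This is an illustration under stated assumptions and does not reproduce the tables: the molecular weights are standard average values for the free-acid 2'-deoxynucleoside 5'-monophosphates, and resolving power is taken in its usual sense of mass divided by the smallest resolvable mass difference. The function names are hypothetical.

```python
from itertools import combinations

# Average molecular weights (Da) of the free-acid dNMPs.
# Standard reference values assumed for this sketch; not copied from Table 1.
DNMP_MASS = {"dA": 331.22, "dC": 307.20, "dG": 347.22, "dT": 322.21}


def pairwise_differences(masses):
    """Absolute mass difference for each pair of nucleotides."""
    return {(a, b): round(abs(masses[a] - masses[b]), 2)
            for a, b in combinations(sorted(masses), 2)}


def max_resolvable_mass(delta_m, resolving_power):
    """Largest mass at which a shift of delta_m is still resolved,
    assuming resolving power = mass / delta_m."""
    return delta_m * resolving_power


diffs = pairwise_differences(DNMP_MASS)
for pair, delta in sorted(diffs.items(), key=lambda item: item[1]):
    print(pair, delta, round(max_resolvable_mass(delta, 1000)))
```

Under these assumptions the smallest difference is that between dA and dT (about 9 Da), so an A/T exchange sets the tightest limit on the size of fragment in which a point mutation can be resolved at a given resolving power.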

BRIEF DESCRIPTION OF THE FIGURES

[0032] Figure 1 shows the sequence of oligonucleotides used to examine the
cleaving ability of various secondary amines [SEQ. IDs. 1 - 6].

[0033] Figure 2 shows the result of cleavage of the oligonucleotides depicted in
Fig. 1 using 2-pyrrolidinemethanol, 3-pyrrolidinol and 4-piperidineethanol.

[0034] Figure 3 shows the result of cleavage of the oligonucleotides depicted in
Fig. 1 using 3-pyrrolidinol at a higher temperature.

[0035] Figure 4 shows a synthetic route to 3-pyrrolidinemethanol.

[0036] Figure 5 is a comparison of the results of cleavage of the oligonucleotides
depicted in Fig. 1 using 3-pyrrolidinol and 3-pyrrolidinemethanol.

[0037] Figure 6 is a schematic representation of genotyping by chemical
cleavage. The template is amplified using one cleavable nucleotide analog, dA*TP.
The amplicons are chemically cleaved to give fragments with the indicated length and
mass differences. The fragments obtained can be analyzed by mass spectrometry or
electrophoresis.

[0038] Figures 7-13 show various aspects of genotyping using the methods of
this invention.

[0039] Figure 7(a) shows an 82 bp fragment of transferrin receptor containing the
indicated polymorphism [SEQ. IDs 7 and 8], which is amplified using a modified
nucleotide, dA*TP, the structure of which is shown.

[0040] Figure 7(b) shows the fragments [SEQ. IDs 9 - 15] expected from
cleavage at the modified nucleotide of 7(a).

[0041] Figure 8 illustrates genotyping by detection of mass differences obtained
from the amplification and cleavage of the variant forms of transferrin receptor. Only
the fragments that illustrate the length and mass differences among the fragments of
the same (invariant) and different (variant) alleles are shown.

[0042] Figure 9 shows the mass spectra of the three possible genotypes of the
transferrin receptor gene used in Fig. 8.

CI [0043] Figure 10 is another illustration ofgenotyping by mass spectrometry. The 
spectrum is a MALDI-TOF analysis of a chemically cleaved DNA fragment. The boxed 

Pi I 

■ areas are regions that contain fragments with polymorphism. 
20 [0044] Figure 1 1 illustrates genotyping by chemical cleavage followed by 
electrophoresis. The capillary electrophoresis analysis of a chemically cleaved 
polymorphic DNA fragment is depicted. 

[0045] Figure 12 shows the denaturing 20% PAGE analysis of the chemically 
cleaved amplicon of Fig. 11. 

[0046] Figures 13(a) - (d) illustrate genotyping by fluorescence resonance
energy transfer (FRET): (A) the template is amplified using one modified, cleavable
nucleotide (dA*TP). Primer 1 is modified with a fluor, F1; (B) after cleavage, a probe
modified with a second fluor, F2, and complementary to primer 1 is added; (C) at elevated
temperature, the allele shortened by cleavage is not bound to the probe (and, therefore,
no FRET is produced) while the uncleaved allele remains bound, giving a FRET signal. (D)
shows a means for positive detection of the short fragment by modifying the probe to
contain a hairpin and an additional fluor, F3. The hairpin will open only after binding
with the longer, uncleaved fragment, resulting in a difference in FRET production.

DEFINITIONS 

[0047] As used herein, the term "detecting" refers to the determination of the
presence or absence of a variance, in particular one or more single nucleotide
polymorphisms (SNPs) in the nucleotide sequence of a polynucleotide when compared
to a related polynucleotide.

[0048] As used herein, a "reagent" refers to a chemical entity or physical force
that causes the cleavage of a modified polynucleotide at point(s) where a modified
nucleotide is substituted for a natural nucleotide. Such reagents include, without
limitation, a chemical or combination of chemicals, normal or coherent (laser) visible or
UV light, heat, high energy ion bombardment and irradiation. A "combination of
reagents" refers to two or more reagents, which can be used simultaneously or
sequentially. By simultaneously is meant that the two or more reagents are together
placed in contact with a modified polynucleotide to be cleaved, although it is understood
that they may in fact react with the polynucleotide one at a time. By sequentially is
meant that the polynucleotide is contacted with one reagent and then, when that
reaction is complete, a second reagent is added, and so on. For instance, as described
in the Examples section of this disclosure, it may be necessary or desirable to contact a
modified polynucleotide of this invention with an oxidizing agent prior to contacting it
with a chemical base to effect cleavage.

[0049] As used herein, the terms "cleaving," "cleaved" and "cleavage" all relate to
the scission of a polynucleotide chain at substantially each point of occurrence in the
polynucleotide chain of a base-modified nucleotide of this invention. The
polynucleotide chain may be single-stranded or double-stranded. When primers are
used to amplify or otherwise replicate a template to obtain a version of the
polynucleotide with a base-modified nucleotide incorporated in place of each
corresponding natural nucleotide, i.e., to create a modified polynucleotide, it is to be
understood that the primer does not take part in the replacement or cleavage reaction.
That is, no natural nucleotide in the primer is replaced with a modified nucleotide and
the primer is not cleaved.

[0050] As used herein, a "related" polynucleotide is a polynucleotide obtained
from a source genetically similar to that from which another polynucleotide is obtained
such that the nucleotide sequences of the two polynucleotides would be expected to be
exactly the same in the absence of a variance. As used herein, polynucleotides that
have overlapping sequences of 35 nucleotides or more that, in the absence of a
variance, would be exactly the same are considered "related" polynucleotides.

[0051] As used herein, a "variance" is a difference in nucleotide sequence
among related polynucleotides. Except as otherwise stated, the term "mutation" is
used interchangeably with "variance" herein. A variance may involve the addition or
deletion of a nucleotide from the sequence of one polynucleotide compared to the
sequence of a related polynucleotide. Or, it may be the substitution of one nucleotide
for another. As used herein, the term "variance" in the singular is understood to include
multiple variances; i.e., two or more nucleotide additions, deletions and/or substitutions
in the same polynucleotide. A particular type of variance is the "polymorphism" or
"single nucleotide polymorphism," which, as the name suggests, is a variance
consisting of a single substitution of one nucleotide for another.

[0052] Thus, as used herein, a "single nucleotide polymorphism" or "SNP" refers
to a polynucleotide that differs from another polynucleotide at a particular locus by
virtue of a single nucleotide exchange. A polynucleotide may, of course, contain
numerous SNPs; however, each must occur at a different locus and consist of a single
nucleotide exchange. For example, exchanging one A for one C, G or T at a particular
locus in the sequence of a polynucleotide constitutes a SNP. When referring to SNPs,
the polynucleotide is most often genetic DNA. As such, to qualify as a SNP, the
polymorphism must occur at a frequency greater than 1% in a given population. SNPs
can occur in coding and non-coding regions of the gene. Those in coding regions are
of primary interest because it is they that cause changes in the phenotype, i.e., a
detectable physical difference, of an individual compared to the general population.
Detectable physical differences include, without limitation, a difference in susceptibility
to a particular disease or disorder or a difference in response to a therapeutic regime
used to treat or prevent a disease or disorder.

[0053] As used herein, a "target polynucleotide" simply refers to a polynucleotide
that is suspected to contain a variance and therefore is being subjected to the method
of this invention to determine whether or not it does.

[0054] As used herein, a "reference polynucleotide" refers to a polynucleotide
that is related to a test polynucleotide but is known to either contain or not contain the
subject polymorphism.

[0055] As used herein, the phrase "suspected of containing a variance" refers to a
test polynucleotide in which a difference in the nucleotide sequence at a particular locus
is known generally to occur in some individuals compared to the general population but
it is unknown whether that difference exists in the test polynucleotide. When the test
and related polynucleotides are genetic DNA, the difference in their sequences consists of
the exchange of a single nucleotide at a given position in the sequence and the
frequency of such exchange in the population is 1% or greater, the test polynucleotide
may be characterized as being "suspected of containing a polymorphism (or SNP)."
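
The two criteria stated above (a single nucleotide exchange at a given position, occurring at a population frequency of 1% or greater) can be sketched, for illustration only, as follows. The sequences, counts and function names are hypothetical, and the sketch assumes two already aligned sequences of equal length.

```python
def single_nucleotide_differences(seq_a, seq_b):
    """Return (position, base_a, base_b) for each mismatch between two
    aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned and of equal length")
    return [(i, a, b) for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


def meets_frequency_criterion(variant_count, population_size, threshold=0.01):
    """True if the exchange occurs at a frequency of 1% or greater."""
    return variant_count / population_size >= threshold


reference = "ACGTTGCA"  # hypothetical reference sequence
test = "ACGTCGCA"       # hypothetical test sequence, T -> C at index 4

diffs = single_nucleotide_differences(reference, test)
print(diffs)                                # [(4, 'T', 'C')]
print(meets_frequency_criterion(3, 200))    # True  (1.5%)
print(meets_frequency_criterion(1, 1000))   # False (0.1%)
```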

[0056] As used herein, "amplifying a segment" refers to the process of producing
multiple copies of a segment of a double stranded polynucleotide by hybridizing natural
nucleotide primers 3' to the segment on each strand and then treating the strands with
one or more polymerases to extend both strands. As a result of using two strands and
two primers, the process becomes logarithmic. The most common procedure for
accomplishing amplification is the polymerase chain reaction or PCR, which is well
known to those skilled in the art. The end result of amplification is the production of a
sufficient amount of the segment to permit relatively facile manipulation. Manipulation
refers to both physical and chemical manipulation, that is, the ability to move bulk
quantities of the segment around and to conduct chemical reactions with the segment
that result in detectable products.
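
As a rough illustration of why amplification with two strands and two primers yields a workable amount of the segment, the copy number under idealized conditions can be computed as below. The assumption of perfect doubling in every cycle is made only for this sketch; real reactions fall short of it, and the function name is hypothetical.

```python
def ideal_copies(initial_copies: int, cycles: int) -> int:
    """Copies after the given number of cycles, assuming each cycle
    exactly doubles the number of copies (an idealization)."""
    return initial_copies * 2 ** cycles


for n in (10, 20, 30):
    print(n, ideal_copies(1, n))
# 10 1024
# 20 1048576
# 30 1073741824
```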

[0057] As used herein, "primer extension" refers to the reproduction of the
sequence of a segment of a polynucleotide by hybridization of a natural nucleotide
primer to the polynucleotide 3' of the segment followed by treatment with a polymerase
and four nucleotides, one or more of which may be a modified nucleotide, to extend the
primer and create a copy of the segment.

[0058] As used herein, a "segment" of a polynucleotide refers to a portion of the
complete nucleotide sequence of the polynucleotide.

[0059] As used herein, a "modified segment" refers to a segment in which one or
more natural nucleotides have been replaced with one or more base-modified
nucleotides.

[0060] As used herein, a "modified, labeled segment" refers to a modified
segment that also contains an entity that is readily detectable, visually or by
instrumental means.

[0061] As used herein, the phrase "encompassing the suspected polymorphism"
means that the nucleotide or nucleotides that constitute a variance are included in the
sequence of a selected segment of the polynucleotide.

[0062] By "homozygous" is meant that the two alleles of a diploid cell or organism
have exactly the same nucleotide sequence.

[0063] By "heterozygous" is meant that the two alleles of a diploid cell or
organism have a difference in their nucleotide sequence at a particular locus. In most
cases, the difference is a SNP.

[0064] A "sequence" or "nucleotide sequence" refers to the order of nucleotide
residues in a nucleic acid.


[0065] A "nucleoside" refers to a base linked to a sugar. The base may be
adenine (A), guanine (G) (or its substitute, inosine (I)), cytosine (C), or thymine (T) (or
its substitute, uracil (U)). The sugar may be ribose (the sugar of a natural nucleotide in
RNA) or 2-deoxyribose (the sugar of a natural nucleotide in DNA).

[0066] A "nucleoside triphosphate" refers to a nucleoside linked to a triphosphate
group (⁻O-P(=O)(O⁻)-O-P(=O)(O⁻)-O-P(=O)(O⁻)-O-nucleoside). The triphosphate group
has four formal negative charges that require counter-ions, i.e., positively charged ions.
Any positively charged ion can be used, e.g., without limitation, Na⁺, K⁺, NH₄⁺, Mg²⁺,
Ca²⁺, etc. Mg²⁺ is one of the most commonly used counter-ions. It is accepted
convention in the art to omit the counter-ion, which is understood to be present, when
displaying nucleoside triphosphates; the convention is followed in this application.

[0067] As used herein, unless expressly noted otherwise, the term "nucleoside
triphosphate" or reference to any specific nucleoside triphosphate, e.g., adenosine
triphosphate, guanosine triphosphate or cytidine triphosphate, refers to a triphosphate
comprising either a ribonucleoside or a 2'-deoxyribonucleoside.

[0068] A "nucleotide" refers to a nucleoside linked to a single phosphate group.

[0069] A "natural nucleotide" refers to an A, C, G or U nucleotide when referring
to RNA and to dA, dC, dG and dT (the "d" referring to the fact that the sugar is a
deoxyribose) when referring to DNA. A natural nucleotide also refers to a nucleotide
which may have a different structure from the above, but which is naturally incorporated
into a polynucleotide sequence by the organism which is the source of the
polynucleotide.

[0070] As used herein, a "modified nucleotide" refers to a nucleotide that meets
two criteria. First, a modified nucleotide is a "non-natural" nucleotide. In one aspect, a
"non-natural" nucleotide may be a natural nucleotide that is placed in non-natural
surroundings. For example, in a polynucleotide that is naturally composed of
deoxyribonucleotides, i.e., DNA, a ribonucleotide would constitute a "non-natural"
nucleotide. Similarly, in a polynucleotide that is naturally composed of ribonucleotides,
i.e., RNA, a deoxyribonucleotide would constitute a non-natural nucleotide. A
"non-natural" nucleotide also refers to a natural nucleotide that has been chemically altered.
For example, without limitation, one or more substituent groups may be added to the
base, sugar or phosphate moieties of the nucleotide. On the other hand, one or more
substituents may be deleted from the base, sugar or phosphate moiety. Or, one or
more atoms or substituents may be substituted for one or more others in the nucleotide.
A "modified" nucleotide may also be a molecule that resembles a natural nucleotide
little, if at all, but is nevertheless capable of being incorporated by a polymerase into a
polynucleotide in place of a natural nucleotide. In particular with respect to the present
invention, the modified nucleotide is a base-modified nucleotide. By "base-modified
nucleotide" is meant a nucleotide in which the normal heterocyclic nitrogen base,
adenine, guanine, cytosine, thymine or uracil, is chemically modified by the addition,
deletion and/or substitution of one or more substituents or atoms for that in the normal
base.

[0071] The second requirement for a "modified" nucleotide, as the term is used
herein, is that it alters the cleavage properties of the polynucleotide into which it is
incorporated. For example, without limitation, incorporation of a ribonucleotide into a
polynucleotide composed predominantly of deoxyribonucleotides imparts a
susceptibility to alkaline cleavage at the sites of incorporation that does not otherwise
exist. This second criterion of a "modified" nucleotide may be met by substitution of one
non-natural nucleotide for a natural nucleotide (e.g., the substitution of a ribonucleotide
for a deoxyribonucleotide described above). It may also be met by substitution of two
non-natural nucleotides that do not individually alter the cleavage properties of a
polynucleotide, for their natural counterparts. When in a particular spatial relationship
to one another in a polynucleotide into which they have been incorporated, enhanced
cleavage of the polynucleotide will occur at the site of incorporation (referred to as
"dinucleotide cleavage").

[0072] As used herein, "having different cleavage characteristics" refers to two or
more modified nucleotides that, when incorporated into a polynucleotide, can be
selectively cleaved in each other's presence by using different reagents and/or reaction
conditions.

[0073] As used herein, a "label" or "tag" refers to a molecule that can be attached
to another molecule, such as, without limitation, a polynucleotide or a segment thereof,
to provide a means by which the other molecule can be readily detected. In the case of
polynucleotides or segments thereof, the attachment can be accomplished by, for
example, covalent bonding or hybridization. Two common types of tags that are useful
in the methods of this invention are fluorescence (or fluorescent) tags and radiolabels
or radioactive tags. When excited by light at a selected wavelength, a fluorescence tag
emits light at a different wavelength that can be detected visually or instrumentally (e.g.,
with a UV spectrophotometer). The fluorescing entity is sometimes referred to as a
"fluorophore." A radiolabel or radioactive tag emits radioactive particles detectable with
an instrument such as, without limitation, a scintillation counter.

[0074] A "mass-modified" nucleotide is a nucleotide in which an atom or chemical
group has been added, deleted or substituted for another group solely for the purpose
of changing the mass of the molecule. That is, it does not alter the cleavage properties
of a polynucleotide into which it is incorporated.

[0075] A "polynucleotide" refers to a linear chain of 30 or more nucleoside
5'-monophosphate residues linked by phosphodiester bonds between the 3' hydroxyl
group of one sugar and the 5' hydroxyl group of the next.

[0076] A "modified polynucleotide" refers to a polynucleotide in which a natural
nucleotide has been substantially completely replaced at each point of its occurrence
with a modified nucleotide. It may also refer to the substantially complete replacement
of two, three or four natural nucleotides with two, three or four modified nucleotides
where each of the modified nucleotides alters the cleavage properties of the resulting
modified polynucleotide differently such that cleavage can be carried out independently
for each modified nucleotide.

[0077] As used herein, to "alter the cleavage properties" of a polynucleotide
means to render the polynucleotide more or less susceptible to cleavage at the point of
incorporation of a modified nucleotide than a related polynucleotide having a natural
nucleotide or a different non-natural nucleotide at the same locus. It is presently
preferred to "alter the cleavage properties" by rendering a polynucleotide more
susceptible to cleavage at the sites of incorporation of modified nucleotides than at
other sites in the molecule. As used herein, the use of the singular when referring to
nucleotide substitution is to be construed as including substitution at substantially each
point of occurrence of the natural nucleotide unless expressly noted to be otherwise.

[0078] As used herein, a "template" refers to a polynucleotide strand, which a
polymerase uses as a means of recognizing which nucleotide it should next incorporate
into a growing strand to duplicate a polynucleotide. If the polynucleotide is DNA, it may
be single-stranded or double-stranded. When employing the polymerase chain reaction
(PCR) to amplify a template in the method of this invention, it is understood that,
although the initial copies will be modified by incorporation of modified nucleotides, the
copies themselves still serve as templates from which a polymerase is able to
synthesize additional modified copies.

[0079] As used herein, a "primer" refers to an oligonucleotide formed from natural
nucleotides, the sequence of which is complementary to a segment of a template to be
replicated. A polymerase uses the primer as the starting point for the replication
process. By "complementary" is meant that the nucleotide sequence of a primer is such
that the primer can stably hybridize to the template by virtue of the formation of
hydrogen bonded base-pairs over a length of at least ten consecutive bases. In the
methods of this invention, a primer is never modified by incorporation of a modified
nucleotide nor is it ever cleaved by the reagent or reagents used to cleave its extension
product.
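
The requirement of base-pairing over at least ten consecutive bases can be illustrated with a minimal sketch. It assumes DNA sequences written 5' to 3', considers only Watson-Crick pairs (A-T, G-C), and treats an uninterrupted run of paired bases as a stand-in for stable hybridization; actual hybridization stability also depends on temperature, salt and sequence context. The sequences and function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_complementary_run(primer, template):
    """Length of the longest stretch of consecutive primer bases that can
    base-pair, antiparallel, with a stretch of the template."""
    target = reverse_complement(template)  # the primer matches this where it pairs
    best = 0
    prev = [0] * (len(target) + 1)
    for p in primer:  # longest-common-substring dynamic programme
        cur = [0] * (len(target) + 1)
        for j, t in enumerate(target, start=1):
            if p == t:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def is_complementary(primer, template, min_run=10):
    """True if at least min_run consecutive bases can pair."""
    return longest_complementary_run(primer, template) >= min_run


template = "ATGCGTACGTTAGCCTAGGA"             # hypothetical template
primer = reverse_complement(template[5:17])   # 12-base primer designed to pair
print(longest_complementary_run(primer, template))  # 12
print(is_complementary(primer, template))            # True
print(is_complementary("A" * 12, template))          # False
```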

[0080] As used herein, a "polymerase" refers, without limitation, to DNA or RNA
polymerases, mutant versions thereof and to reverse transcriptases. DNA or RNA
polymerases can be mutagenized by, without limitation, nucleotide addition, nucleotide
deletion, one or more point mutations, "DNA shuffling" or joining portions of different
polymerases to make chimeric polymerases. Combinations of these mutagenizing
techniques may also be used. A polymerase catalyzes the assembly of nucleotides to
form polynucleotides. Polymerases may be used either to extend a primer once or
repetitively. Repetitive extension is sometimes referred to as amplification.
Amplification may be accomplished by, without limitation, PCR, NASBA, SDA, 3SR,
TSA and rolling circle replication. In the methods of this invention, one or more
polymerases and one or more extension or amplification techniques may be used to
replicate a particular polynucleotide.

[0081] "Electrophoresis" refers to a technique for separating nucleotide
fragments by size using a gel matrix across which an electrical potential has been
generated. Forms of electrophoresis include, without limitation, slab gel electrophoresis
and capillary electrophoresis.

[0082] "Mass spectrometry" refers to a technique for analysis of a chemical
compound by examination of the masses of the fragments obtained when the
compound is subjected to an ionizing potential. Forms of mass spectrometry include,
without limitation, matrix assisted laser desorption ionization (MALDI) and electrospray
ionization (ESI), optionally employing such features as time-of-flight, quadrupole or
Fourier transform detection. While the use of mass spectrometry constitutes a
preferred embodiment of this invention, other instrumental techniques may become
available for the determination of the mass or the comparison of masses of
oligonucleotides and polynucleotides. Any such instrumental procedure is within the
scope of this invention.

[0083] "FRET" refers to fluorescence resonance energy transfer, a distance
dependent interaction between the electronic excited states of two dye molecules in
which energy is transferred from one dye (the donor) to another dye (the acceptor)
without emission of a photon. To employ FRET in the present invention, the dye
molecules are located on opposite sides of a cleavable modified nucleotide. Cleavage,
with or without secondary structure formation, alters the proximity of the dyes to one
another resulting in predictable changes in their fluorescence output.
[0084] FRET can result in quenching, differential light emission or depolarization.
Quenching occurs when the donor absorbs light at its excitation wavelength and then,
instead of emitting light at its emission wavelength, transfers some or all of its energy to
the acceptor, which is not a fluorescing species. If the acceptor is a fluorescing
species, upon absorbing light from the donor it emits light at its own characteristic
wavelength, which is different from that of the donor. Quantitative differences in the
emissions of the donor and acceptor can be used to deduce information about the
molecules to which they are attached. Fluorescent depolarization can be used when
the donor and acceptor are the same molecule. A donor molecule is excited with plane
polarized light. If no energy is transferred to the other molecule, the light emitted by the
donor will remain polarized. If, on the other hand, energy is transferred to the acceptor,
which then fluoresces, the emitted light will be depolarized.
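
The distance dependence of FRET described above is commonly modelled with the Förster relation, E = 1 / (1 + (r/R0)^6), where r is the donor-acceptor separation and R0 is the separation giving 50% transfer. The following is a minimal sketch of that relation only. The Förster radius of 5.0 nm and the separations before and after cleavage are assumed illustrative values and are not taken from this disclosure.

```python
def fret_efficiency(r_nm: float, r0_nm: float = 5.0) -> float:
    """Forster transfer efficiency for a donor-acceptor separation r_nm.

    r0_nm is the separation giving 50% transfer; 5.0 nm is an assumed,
    dye-pair-specific placeholder value.
    """
    if r_nm < 0 or r0_nm <= 0:
        raise ValueError("r_nm must be >= 0 and r0_nm must be > 0")
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)


# Hypothetical separations: dyes close together on the intact strand,
# far apart once cleavage at the modified nucleotide separates them.
efficiency_intact = fret_efficiency(3.0)
efficiency_cleaved = fret_efficiency(10.0)
print(f"intact:  {efficiency_intact:.3f}")
print(f"cleaved: {efficiency_cleaved:.3f}")
```

Because efficiency falls with the sixth power of distance, even a modest increase in dye separation on cleavage would be expected to produce a large change in fluorescence output.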

[0085] As used herein, a "chemical oxidant" refers to a reagent capable of
increasing the oxidation state of a group on a molecule. For instance, without limitation,
a hydroxyl group (-OH) can be oxidized to an aldehyde, ketone or acid. Some
examples of chemical oxidants are, without limitation, potassium permanganate, t-butyl
hypochlorite, m-chloroperbenzoic acid, hydrogen peroxide, sodium hypochlorite, ozone,
peracetic acid, potassium persulfate, and sodium hypobromite.

[0086] As used herein, a "chemical base" refers to a chemical compound that, in
aqueous medium, has a pK greater than 7.0. A chemical base may be inorganic or
organic. Examples of inorganic chemical bases are, without limitation, alkali (sodium,
potassium, lithium) and alkaline earth (calcium, magnesium, barium) hydroxides,
carbonates, bicarbonates, phosphates and the like. Ammonium hydroxide is another
inorganic chemical base. Nitrogen-containing organic compounds such as pyridine,
aniline, quinoline, morpholine, piperidine and pyrrole are also chemical bases.
Nitrogen-containing chemical bases may be primary (the nitrogen carries two hydrogen
atoms and one other substituent), secondary (the nitrogen has one hydrogen and two
other substituents attached) or tertiary (the nitrogen carries no hydrogen atoms and three
other substituents). Chemical bases may be used as aqueous solutions, which may be
mild (usually due to dilution) or strong (concentrated solutions). A chemical base also
refers to a strong non-aqueous organic base; examples include, without limitation,
sodium methoxide, sodium ethoxide and potassium t-butoxide.

[0087] Secondary amines are presently preferred chemical bases for use in the
cleavage of modified nucleotides. Secondary amines useful in the methods of this
invention include, without limitation, pyrrolidine, piperidine, 3-pyrrolidinol,
2-pyrrolidinemethanol, 3-pyrrolidinemethanol, 4-piperidineethanol, hexamethyleneimine,
heptamethyleneimine, diethylamine, dipropylamine, dibutylamine, proline, morpholine,
piperazine, picolinic acid, piperazine-2-carboxylic acid, 4-piperidineethanol and
isonipecotic acid. A secondary amine useful in the methods herein may also be polymer
bound, for example without limitation, piperidine-4-carboxylic acid polymine resin
(polystyrene).

[0088] As used herein, the term "acid" refers to a substance that dissociates in
water to produce one or more hydrogen ions. An acid may be inorganic or organic. It
may be a strong acid, which generally implies a highly concentrated solution, or mild,
which generally implies a dilute one. It is, of course, understood that acids inherently
have different strengths; e.g., sulfuric acid is much stronger than acetic acid. The proper
choice of acid will be apparent to those skilled in the art from the disclosures herein.
Preferably, the acids used in the methods of this invention are mild. Examples of mild
inorganic acids are, without limitation, dilute hydrochloric acid, dilute sulfuric acid, dilute
nitric acid, phosphoric acid and boric acid. Examples, without limitation, of mild organic
acids are formic acid, acetic acid, benzoic acid, p-toluenesulfonic acid, trifluoroacetic acid,
naphthoic acid, uric acid and phenol.

[0089] An "alkyl" group as used herein refers to a 1 to 20 carbon atom straight or
branched chain hydrocarbon. Preferably the group consists of a 1 to 10 carbon atom
chain; most preferably, it is a 1 to 4 carbon atom chain. As used herein "1 to 20," etc.
carbon atoms means 1 or 2 or 3 or 4, etc. up to 20 carbon atoms in the chain.

[0090] An "alkenyl" group refers to an alkyl group, as defined herein, consisting
of at least two carbon atoms and at least one carbon-carbon double bond.
An "alkynyl" group refers to an alkyl group, as defined herein, consisting of at least two
carbon atoms and at least one carbon-carbon triple bond.

[0091] A "cycloalkyl" group refers to a 3 to 8 member all-carbon monocyclic ring,
an all-carbon 5-member/6-member or 6-member/6-member fused bicyclic ring or a
multicyclic fused ring (a "fused" ring system means that each ring in the system shares
an adjacent pair of carbon atoms with each other ring in the system) group wherein one
or more of the rings may contain one or more double bonds but none of the rings has a
completely conjugated pi-electron system. Examples, without limitation, of cycloalkyl
groups are cyclopropane, cyclobutane, cyclopentane, cyclopentene, cyclohexane,
cyclohexadiene, adamantane, cycloheptane and cycloheptatriene.
[0092] An "aryl" group refers to an all-carbon monocyclic or fused-ring polycyclic
(i.e., rings which share adjacent pairs of carbon atoms) group having a completely
conjugated pi-electron system. Examples, without limitation, of aryl groups are phenyl,
naphthalenyl and anthracenyl. The aryl group may be substituted or unsubstituted.
[0093] An "aralkyl" group refers to an aryl group that is substituted with an alkyl
group. As used herein, when an aralkyl group bonds to some other group, bonding
occurs at the aryl group.

[0094] An "alkaryl" group refers to an alkyl group that is substituted with an aryl
group. As used herein, when an alkaryl group bonds to some other group, bonding
occurs at the alkyl group.

[0095] As used herein, the terms "selective," "selectively," "substantially,"
"essentially," "uniformly" and the like, mean that the indicated event occurs to a
particular degree. For example, the percent incorporation of a modified nucleotide
herein is characterized as "substantially complete." As used herein, this means greater
than 90%, preferably greater than 95% and, most preferably, greater than 99%. With
regard to cleavage at a modified nucleotide, "selectively" means greater than 10 times,
preferably greater than 25 times, most preferably greater than 100 times that of other
natural or modified nucleotide(s) in the modified polynucleotide. The percent cleavage
at a modified nucleotide is also referred to herein as being "substantially complete."
This means greater than 90%, preferably greater than 95%, most preferably greater
than 99% complete.
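
The numerical thresholds in the preceding definition can be expressed as a small classifier. The sketch below is illustrative only: the function names and tier labels are hypothetical, and the inputs are assumed to be a fractional yield (0 to 1) and two cleavage measurements in the same arbitrary units.

```python
def completeness_tier(fraction: float) -> str:
    """Classify a fractional incorporation or cleavage yield (0.0 to 1.0)."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    if fraction > 0.99:
        return "most preferred"
    if fraction > 0.95:
        return "preferred"
    if fraction > 0.90:
        return "substantially complete"
    return "not substantially complete"


def selectivity_tier(cleavage_at_modified: float, cleavage_elsewhere: float) -> str:
    """Classify cleavage at a modified nucleotide relative to other nucleotides."""
    if cleavage_elsewhere <= 0:
        raise ValueError("cleavage_elsewhere must be positive")
    ratio = cleavage_at_modified / cleavage_elsewhere
    if ratio > 100:
        return "most preferred"
    if ratio > 25:
        return "preferred"
    if ratio > 10:
        return "selective"
    return "not selective"


print(completeness_tier(0.97))      # hypothetical 97% cleavage
print(selectivity_tier(50.0, 1.0))  # hypothetical 50-fold preference
```

All comparisons are strict, following the "greater than" wording of the definition, so a yield of exactly 90% or a ratio of exactly 10 does not qualify.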

[0096] As used herein, an "individual" refers to any higher life form including
reptiles, fish, birds and mammals. In particular, the term refers to human beings.
However, the methods of this invention are useful for the analysis of the nucleic acids of
any living organism.
DISCUSSION 

[0097] The methods of this invention can be used to examine the genetic DNA of
an individual displaying symptoms of a particular disease or disorder known or
suspected to be genetically based. Comparison of the DNA of the individual with that of
healthy members of the same population will confirm whether the individual is afflicted
with a particular genetically-related disease or disorder. The method can also be used
to examine an individual displaying symptoms of a disease or disorder of unknown
origin to determine if it has a genetic component.

[0098] Particularly useful aspects of the methods described herein are ease of
assay design, low cost of reagents and suitability of the cleavage products for detection
by a variety of methods including, without limitation, electrophoresis, mass spectrometry
and fluorescent detection.

a. Base-modified nucleotides 

[0099] A base-modified nucleotide refers to a nucleotide having a chemically
modified adenine, cytosine, guanine or thymine (or, in the case of RNA, uracil). A
modified polynucleotide is selectively cleavable at the sites of incorporation of the
base-modified nucleotide in comparison to sites of incorporation of natural nucleotides. The
base-modified nucleotides of this invention are shown in the Summary, above.
[0100] Cleavage of polynucleotides into which the base-modified nucleotides of
this invention have been incorporated is accomplished using chemical base. Amine
chemical bases, such as diethylamine, dipropylamine and pyrrolidine, are presently
preferred chemical bases. Amines having boiling points in excess of about 100° C at
atmospheric pressure are particularly preferred. While this includes primary amines
with the requisite boiling point, such as, without limitation, 6-hydroxyhexylamine,
secondary amines are presently particularly preferred chemical bases. While not being
bound to any particular theory, it appears that this might be due to the fact that lower
boiling secondary amines volatilize at the relatively high temperatures used to obtain
greater than 90% cleavage, i.e., 80° C or higher, thus making it difficult to maintain an
optimal concentration of the amine in the cleavage reaction. Examples of higher boiling
secondary amines include, without limitation, dibutylamine, piperidine, 3-pyrrolidinol,
hexamethyleneimine, morpholine and pyrazine. Secondary amines having a boiling
point above 150° C are even more preferable, with those having a boiling point above
200° C being the presently most preferred. Such secondary amines include, without
limitation, heptamethyleneimine, 3-pyrrolidinol, 2-pyrrolidinemethanol,
3-pyrrolidinemethanol, proline, picolinic acid, piperazine-2-carboxylic acid,
4-piperidineethanol, isonipecotic acid and piperidine-4-carboxylic acid polymine resin
(polystyrene). 3-Pyrrolidinol, 2-pyrrolidinemethanol, 3-pyrrolidinemethanol and
4-piperidineethanol are presently preferred high boiling secondary amines for use in the
methods of this invention.
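
The boiling-point preferences stated above amount to three nested thresholds (above about 100° C, above 150° C, above 200° C). A minimal sketch of that ranking follows. The function name and tier labels are hypothetical, and the example boiling points are approximate literature values supplied here as assumptions, not figures from this disclosure.

```python
def boiling_point_tier(bp_celsius: float) -> str:
    """Rank an amine by atmospheric boiling point using the thresholds in the text."""
    if bp_celsius > 200.0:
        return "most preferred"
    if bp_celsius > 150.0:
        return "even more preferable"
    if bp_celsius > 100.0:
        return "particularly preferred"
    return "lower boiling"


# Approximate atmospheric boiling points in degrees C (assumed values).
example_amines = {
    "diethylamine": 55.5,
    "piperidine": 106.0,
    "dibutylamine": 159.0,
}

for name, bp in example_amines.items():
    print(f"{name}: {boiling_point_tier(bp)}")
```

The tiers only encode volatility at the cleavage temperature; whether an amine is secondary, which the text also treats as a preference, would need to be tracked separately.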

[0101] When cleavage at a modified base of this invention is carried out in the
presence of a phosphine and a chemical base, a unique adduct forms. For example,
when the phosphine is, without limitation, tris(2-carboxyethyl) phosphine (TCEP), mass
spectrometry of the product is consistent with a structure having a ribose-TCEP adduct
at its 3' end and a phosphate moiety at its 5' end:

[Structure: cleavage product bearing a ribose-TCEP adduct at its 3' end and a phosphate moiety at its 5' end]

[0102] The mechanism of formation of the phosphine adduct is not presently
known; however, without being held to any particular theory, a possible mechanism is
the following:




[0103] The incorporation of a phosphine into the cleavage product can be used
to label polynucleotide fragments at the same time cleavage is being performed. Thus,
by using a phosphine that contains a label or tag but is still capable of forming the
above-described adduct, such moieties as, without limitation, mass tags, fluorescence
tags, radioactive tags and ion-trap tags could be incorporated directly into
polynucleotide fragments during cleavage.

[0104] While other phosphines useful in the cleavage/tagging procedure
described above will become apparent to those skilled in the art based on the
disclosures herein, TCEP is presently preferred. The carboxyl (-C(O)OH) groups of
TCEP can be readily modified, for example, without limitation, by reaction with an
amine, alcohol or mercaptan in the presence of a carbodiimide to form an amide, ester
or mercaptoester:

[Reaction scheme: tris(2-carboxyethyl)phosphine is treated with an alcohol, amine or thiol
(R1M1H) and dicyclohexylcarbodiimide (DCC) to give the monomodified derivative; treatment
with a second alcohol, amine or thiol (R2M2H) and DCC gives the bismodified derivative]

wherein M1 and M2 are independently O, NH, NR, S.

R1 and R2 are mass tags, fluorescent tags, radioactive tags, ion
trap tags or combinations thereof.

[0105] When a carboxyl group is reacted with a carbodiimide in the absence of a
nucleophile, the product may rearrange to form a stable N-acylurea. If the carbodiimide
contains a fluorophore, the resultant phosphine will then carry that fluorophore:

[Reaction scheme: a carbodiimide, R1N=C=NR2, reacts with P[(CH2)2COOH]3 to give an
intermediate that rearranges to the N-acylurea derivative of the phosphine]

[0106] Amino group-containing fluorophores such as fluoresceinyl glycine amide,
(5-aminoacetamido)fluorescein, 7-amino-4-methylcoumarin, 2-aminoacridone,
5-aminofluorescein, 1-pyrenemethylamine and 5-aminoeosin may also be used to
prepare labeled phosphines. Amino derivatives of lucifer yellow and Cascade Blue can
also be employed, as can amino derivatives of biotin. In addition, hydrazine derivatives
such as rhodamine and Texas Red hydrazine may be useful in this method.
Fluorescent diazoalkanes, such as, without limitation, 1-pyrenyldiazomethane, may be
used to form esters with TCEP. Fluorescent alkyl halides may also react with the
carboxylate anion (-C(O)O-) of the phosphine to form esters. Such halides include,
without limitation, panacyl bromide, 3-bromoacetyl-7-diethylaminocoumarin,
6-bromoacetyl-2-diethylaminonaphthalene, 5-bromomethylfluorescein, BODIPY® 493/503
methyl bromide, monobromobimanes and iodoacetamides such as coumarin
iodoacetamide. Naphthalimide sulfonate ester reacts rapidly with the anions of
carboxylic acids in acetonitrile to give adducts which are detectable by absorption at
259 nm down to 100 femtomoles and by fluorescence at 394 nm down to four
femtomoles. There are, furthermore, countless amine-reactive fluorescent probes
known in the art. TCEP can be converted into a primary amine by, for example, the
following reaction:



(CH3)3CO-C(O)-NH(CH2)nNH2 + P[(CH2)2COOH]3  --EDAC-->

(CH3)3CO-C(O)-NH(CH2)nNH-C(O)-(CH2)2P[(CH2)2COOH]2  --CF3COOH-->

H2N(CH2)nNH-C(O)-(CH2)2P[(CH2)2COOH]2

The aminophosphine can then be reacted with an amine-reactive fluorescent probe for use
in the cleavage/labeling method described above.

[0107] Many other phosphines and methods for appending tags to them will
become apparent to those skilled in the art based on the disclosures herein. Such
phosphines, labels and labeling methods are within the scope of this invention.

b. Sugar modification and cleavage

[0108] Modification of the sugar portion of a nucleotide may also afford a
modified polynucleotide that is selectively cleavable at the site(s) of incorporation of
such modified nucleotides. In general, the sugar is modified with one or more
functional groups that render the 3' and/or the 5' phosphate ester linkage more
susceptible to cleavage than the 3' or 5' phosphate ester linkage of the corresponding
natural nucleotide. The following are examples, without limitation, of modified sugar
nucleotides of this invention. Other sugar modifications will become readily apparent to
those skilled in the art in light of the disclosures herein and are therefore deemed to be
within the scope of this invention.




[Structures of the modified sugar nucleotides, with sugar substituents including OH, N3, SH and R]

Base is A, C, G, T, U or I. R is -CN, -N3, -SH, -CH2CN, -CH2OH or -CH2SH. Cleavage is
normally accomplished using acid or chemical base. Treatment with a chemical oxidant
or a reducing agent may be required prior to contact with acid or chemical base.
Presently preferred acids are dilute inorganic acids, such as, without limitation, dilute
hydrochloric acid, dilute sulfuric acid and phosphoric acid. Relatively mild organic acids
such as, without limitation, acetic acid may also be used. Presently preferred chemical
bases are dilute inorganic bases such as dilute sodium hydroxide, dilute potassium
hydroxide and ammonium hydroxide. Non-aqueous bases such as sodium methoxide
or ethoxide may also be used. The choice of acid or base to use can be readily
determined by those skilled in the art based on the disclosure herein.

c. Fragment analysis 

[0109] Analysis of the fragments obtained from the cleavage of a modified
polynucleotide can be accomplished in a number of ways including, without limitation,
electrophoresis, mass spectrometry, inter- or intra-molecular hybridization and FRET. A
presently preferred method is mass spectrometry.

d. Mass Spectrometry 

[0110] Mass spectrometry is a presently preferred analytical tool for the method
of this invention due to its speed, accuracy, reproducibility, low cost and potential for
automation (Fu, D. J., et al., Nature Biotechnology, 1998, 16:381-384). When detection
of a variance in two or more related polynucleotides is the goal, the ability of mass
spectrometry to differentiate masses within a few, even one, atomic mass unit (amu)
permits such detection without the need for determining the complete nucleotide
sequences of the polynucleotides being compared. The required information is obtained
from the masses of the fragments.

[0111] Mass spectrometric identification of a variance depends on the unique
masses of the four deoxynucleotides and their oligomers. Table 1 shows the mass
differences among the four deoxynucleotide monophosphates. In Panel A, the masses
of the four deoxynucleotide residues are shown across the top, and calculated
molecular weight differences between each pair of nucleotide residues are shown in the
table. It is understood that the base-modified nucleotides of this invention will have
different masses than those shown in Panel A for the natural nucleotides. Thus the mass
differences will also be different. In general, the mass difference between a
base-modified nucleotide and the natural nucleotides in Table 1 will be larger, which should
improve mass spec analysis. For example, in Panel B the mass differences between
the natural nucleotides and 2-chloroadenine are shown (far right column). The smallest
mass difference is 17.3 Da instead of 9 Da as in Panel A, providing a greater degree of
discrimination between nucleotides using mass spectrometry.
Table 1

Panel A      dAMP    dCMP    dGMP    dTMP
Mol. Wt.     313.2   289.2   329.2   304.2
vs. dAMP             24      16      9
vs. dCMP                     40      15
vs. dGMP                             25

Panel B      dAMP    dCMP    dGMP    dTMP    2-chloro-adenineMP
Mol. wt.     313.2   289.2   329.2   304.2   347.7
vs. dTMP                                     42.3
vs. dAMP             24      16      9
vs. dCMP                     40      15      57.3
vs. dGMP                             25      17.3
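
The pairwise differences in Panel A of Table 1 follow directly from the four residue masses. The short sketch below recomputes them; the dictionary and function names are illustrative only, and the masses are the Panel A values.

```python
from itertools import combinations

# Residue masses (Da) as listed in Table 1, Panel A.
RESIDUE_MASS = {"dAMP": 313.2, "dCMP": 289.2, "dGMP": 329.2, "dTMP": 304.2}


def pairwise_differences(masses):
    """Absolute mass difference (Da) for every unordered pair of nucleotides."""
    return {
        (first, second): round(abs(masses[first] - masses[second]), 1)
        for first, second in combinations(masses, 2)
    }


differences = pairwise_differences(RESIDUE_MASS)
smallest_pair = min(differences, key=differences.get)

for (first, second), delta in sorted(differences.items(), key=lambda item: item[1]):
    print(f"{first} vs {second}: {delta:.1f} Da")
print("hardest pair to resolve:", smallest_pair)
```

The smallest difference, between dAMP and dTMP, sets the mass accuracy needed to distinguish a single A/T substitution between natural nucleotides.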

[0112] Table 2 shows the calculated masses of all possible 2-mers, 3-mers,
4-mers and 5-mers. As can be seen, only two of the 121 oligomers have the same mass.
Thus, the nucleotide composition of all 2mers, 3mers, 4mers and all but two 5mers
created by cleavage of a polynucleotide can be immediately determined by mass
spectrometry, if the instrument has sufficient resolving power. Given the masses in
Table 2, an instrument with a resolution (full width at half-maximal height) of 1500 to
2000 would be sufficient. Mass spectrometers with resolution up to 10,000 are
commercially available. However, when cleavage is performed at all sites of modified
nucleotide incorporation, it is not necessary to consider the masses of all possible
2mers, 3mers, 4mers, etc. because there can be no internal occurrence of the cleavage
nucleotide in any fragment. For example, if a modified G (mod G) is the cleavage
nucleotide, then all resulting cleavage fragments will have 0 or 1 mod G, depending on
retention or loss of mod G in the fragments. If mod G is retained, it must occur at either
the 3' or the 5' end of the fragment. Thus, if the cleavage chemistry leaves a mod G on
either end of all fragments, then the mass of mod G can be subtracted from the mass of
each fragment and the resulting masses can be compared. The same is, of course, true
of A, C and T. Table 3 shows the masses of all 2mers through 7mers lacking one
nucleotide. From Table 3, it can be seen that cleavage at A or T consistently produces
fragments with larger mass differences between the closest possible cleavage fragments.
Cleavage at A produces mass differences of 5, 10, 15, 20 or 25 Da between the closest
fragments while cleavage at T affords mass differences of 8, 16 or 24 Da, albeit at the
expense of creation of a few more isobaric fragments.

TABLE 2

(the rightmost pair of columns continues the 5mer list)

2mer  mass   3mer  mass   4mer  mass   5mer   mass   5mer   mass
CC    596    CCC   885    CCCC  1174   CCCCC  1463   TTTGG  1588
CT    611    CCT   900    CCCT  1189   CCCCT  1478   TAAAG  1590
AC    620    CCA   909    CCCA  1198   CCCCA  1487   CAAGG  1591
TT    626    CTT   915    CCTT  1204   CCCTT  1493   ATTGG  1597
AT    635    CTA   924    CCTA  1213   CCCTA  1502   CTGGG  1598
CG    636    CCG   925    CCCG  1214   CCCCG  1503   AAAAG  1599
AA    644    TTT   930    CTTT  1219   CCTTT  1508   TAAGG  1606
GT    651    CAA   933    CCAA  1222   CCCAA  1511   ACGGG  1607
AG    660    TTA   939    CTTA  1228   CCTTA  1517   TTGGG  1613
GG    676    CTG   940    CCTG  1229   CCCTG  1518   AAAGG  1615
             TAA   948    TTTT  1234   CTTTT  1523   ATGGG  1622
             CGA   949    CAAT  1237   CCTAA  1526   CGGGG  1623
             TTG   955    CCAG  1238   CCCGA  1527   AAGGG  1631
             AAA   957    TTTA  1243   CTTTA  1532   TGGGG  1638
             TGA   964    CTTG  1244   CCTTG  1533   AGGGG  1647
             CGG   965    CAAA  1246   CCAAA  1535   GGGGG  1663
             AAG   973    TTAA  1252   TTTTT  1538
             TGG   980    CTAG  1253   CTTAA  1541
             GGA   989    CCGG  1254   CCTGA  1542
             GGG   1005   TTTG  1259   CCCGG  1543
                          TAAA  1261   TTTTA  1547
                          CAAG  1262   CTTTG  1548
                          TTAG  1268   CAATA  1550
                          CTGG  1269   CCAGA  1551
                          AAAA  1270   TTTAA  1556
                          TAAG  1277   CTTGA  1557
                          CAGG  1278   CCTGG  1558
                          TTGG  1284   CAAAA  1559
                          AAAG  1286   TTTTG  1563
                          TAGG  1293   TTAAA  1565
                          CGGG  1294   CTAGA  1566
                          AAGG  1302   CCGGA  1567
                          TGGG  1309   TTTGA  1572
                          AGGG  1318   CTTGG  1573
                          GGGG  1334   TAAAA  1574
                                       CAAAG  1575
                                       TTAAG  1581
                                       CTGGA  1582
                                       AAAAA  1583
                                       CCGGG  1583
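
The counts quoted for Table 2 (121 compositions, one isobaric pair, a resolution of 1500 to 2000) can be checked by enumeration. The sketch below is a reconstruction under stated assumptions rather than the original calculation: it takes the Table 1 residue masses truncated to whole daltons and adds 18 Da for the terminal water, a convention that reproduces the legible Table 2 entries, and it uses the common m/Δm definition of resolving power. All names are illustrative.

```python
from collections import defaultdict
from itertools import combinations_with_replacement

# Assumed nominal residue masses (Table 1 values truncated to whole Da).
NOMINAL_MASS = {"A": 313, "C": 289, "G": 329, "T": 304}
WATER = 18  # assumed terminal water added once per oligomer


def oligo_mass(composition: str) -> int:
    """Nominal mass of an oligomer given its base composition, e.g. 'CCGGG'."""
    return sum(NOMINAL_MASS[base] for base in composition) + WATER


def compositions(min_len: int = 2, max_len: int = 5):
    """Yield every base composition (order ignored) from min_len to max_len."""
    for length in range(min_len, max_len + 1):
        for combo in combinations_with_replacement("ACGT", length):
            yield "".join(combo)


by_mass = defaultdict(list)
for comp in compositions():
    by_mass[oligo_mass(comp)].append(comp)

total = sum(len(group) for group in by_mass.values())
isobaric = {mass: group for mass, group in by_mass.items() if len(group) > 1}

# Resolving power m / delta-m needed to separate each pair of neighbouring masses.
masses = sorted(by_mass)
required_resolution = max(hi / (hi - lo) for lo, hi in zip(masses, masses[1:]))

print("compositions:", total)
print("isobaric groups:", isobaric)
print("resolving power needed:", round(required_resolution))
```

Under these assumptions the only coincidence is between the all-A 5mer and the 5mer with two C and three G, and the closest distinct masses differ by 1 Da near 1600 Da, which is consistent with the 1500 to 2000 figure given in the text.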

[01 1 31 It has been found that, of all oligonucleotides up to the SOmers, only 8 
sets of isobaric oligonucleotides (oligonucleotides having masses within 0.01% of each 
other) exist. These are shown in Table 4. Inspection of Table 4 reveals that every set 
except 
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CiMvaneatG CleavaaeatC Cleavage at A Cleavage at T 


2mer 


mass 


mass A 




2mer 


Mass 


mass A 




2mer 


mass 


mass A 




2mer 


Mass 


mass A 




^51 7 






n 


547 






CC 


517 






CC 


517 




f*T 
Ul 


00a. 


15 




AT 


556 


9 




CT 


532 


15 




AC 


541 


24 






9 




AA 


565 


9 




n 


547 


15 




CG 


557 


16 


TT 
1 i 


'SIT 


6 




GT 


572 


7 




CG 


557 


10 




AA 


565 


8 


AT 




9 




AG 


581 


9 




GT 


572 


15 




AG 


581 


16 


AA 

MM 




9 




CG 


597 


16 




CG 


597 


25 




CG 


597 


16 








oiiici 


mace 


mass A 




3mer 


Mass 


mass A 




3mer 


mass 


mass A 




3mer 


mass 


mass A 




one 
QUO 




TTT 


851 




CCC 


806 




CCC 


806 






flO't 


4C 


HA 


860 


9 


CCT 


821 


15 


CCA 


830 


24 






9 


TAA 


869 


9 


cn 


836 


15 


CCG 


846 


16 


m 
ul 1 


OOQ 


D 


HG 


876 


7 


CCG 


846 


10 


CAA 


854 


8 


PTA 


0*13 


9 


AAA 


878 


2 


rrr 


851 


5 


CGA 


870 


16 


TTT 
1 1 1 


oDl 


6 


TGA 


885 


7 


CTG 


861 


10 


AAA 


878 


8 






3 


AAG 


894 


9 


TTG 


876 


15 


CGG 


886 


8 


TTA 
1 lA 


oOU 


6 


TGG 


901 


7 


CGG 


886 


10 


AAG 


894 


8 


TAA 
1 AA 


oD9 


9 


GGA 


910 


9 


TGG 


901 


15 


GGA 


910 


16 


AAA 
AAA 




9 


GGG 


926 


16 


GGG 


926 


25 


GGG 


926 


16 




















j— J 1 


mass 


mass A 




4mer 


mass 


mass A 




4mer 


mass 


mass A 




4mer 


mass 


mass A 




lust) 




mrr 


1155 




CCCC 


1095 




CCCC 


1095 




CCCT 


1110 


15 


TTTA 


1164 


9 


CCCT 


1110 


15 


CCCA 


1119 


24 


COCA 


1119 


9 


HAA 


1173 


9 


ccn 


1125 


15 


CCCG 


1135 


16 


CCTT 


1125 


6 


TTTG 


1180 




CCCG 


1135 


10 


CCAA 


1143 


8 


CCTA 


1134 


9 


TAAA 


1182 




crrr 


1140 


5 


CCAG 


1159 


16 


CTTT 


1140 


6 


HAG 


1189 




CCTG 


1150 


10 


CAAA 


1167 


8 


CCAA 


1143 


3 


AAAA 


1191 




TTTT 


1155 


5 


CCGG 


1175 


8 


CTTA 


ii4y 


6 


TAAG 


1198 




CHG 


1165 


10 


CAAG 


1183 


8 


INI 


1155 


6 


HGG 


1205 




CCGG 


1175 


10 


AAAA 


1191 


8 


CAAT 


4 4 CO 
1158 


3 


AAAG 


1207 




TTTG 


1180 


5 


CAGG 


1199 


8 


TTT A 

TTTA 


1164 


6 


TAGG 


1214 




CTGG 


1190 


10 


AAAG 


1207 


8 


A A 

uAAA 


4 4 £7 


3 


AAGG 


1223 


9 


HGG 


1205 


15 


CGGG 


1215 


8 


TTA A 

1 lAA 


447'} 
Jlf 0 


6 


TGGG 


1230 


7 


CGGG 


1215 


10 


AAGG 


1223 


8 


TA A A 

TAAA 


4 4QO 


9 


AGGG 


1239 


9 


TGGG 


1230 


15 


AGGG 


1239 


16 


A A A A 

AAAA 


lltf 1 


9 


GGGG 


1255 


16 


GGGG 


1255 


25 


GGGG 


1255 


16 




















omer 


mass 


mass A 




5mer 


mass 


mass A 




5mer 


mass 


mass A 




5mer 


mass 


mass A 


^^^^^ 

CCCCC 


1384 




TTTTT 


1459 




CCCCC 


1384 




CCCCC 


1384 




CCCCT 


1399 


15 


TTTTA 


1468 


9 


CCCCT 


1399 


15 


CCCCA 


1408 


24 


CCCCA 


1408 


9 


TTTAA 


1477 


9 


cccn 


1414 


15 


CCCCG 


1424 


16 


CCCTT 


1414 


6 


TTTTG 


1484 


7 


CCCCG 


1424 


10 


CCCAA 


1432 


8 


CCCTA 


1423 


9 


HAAA 


1486 


2 


CCTH 


1429 


5 


CCCGA 


1448 


16 


CCTTT 


1429 


6 


TTTGA 


1493 


7 


CCCTG 


1439 


10 


CCAAA 


1456 


8 


CCCAA 


1432 


3 


TAAAA 


1495 


2 


crrrr 


1444 


5 




CCCGG 


1464 


8 


CCTTA 


1438 


6 


HAAG 


1502 


7 


CCHG 


1454 


10 




CCAGA 


1472 


8 


CTTTT 


1444 


6 




AAAAA 


1504 


2 




TTTTT 


1459 


5 




CAAAA 


1480 


8 


CGTAA 


1447 


3 




rrTGG 


1509 


5 




CCCGG 


1464 


5 




CCGGA 


1488 


8 


CTFTA 


1453 


6 




TAAAG 


1511 


2 




CTHG 


1469 


5 




CAAAG 


1496 


8 


CCAAA 


1456 


3 




AHGG 


1518 


7 




CCTGG 


1479 


10 




AAAAA 


1504 


8 


mil 


1459 


3 




AAAAG 


1520 


2 




TTTTG 


1484 


5 




CCGGG 


1504 


0 


CHAA 


1462 


3 




TAAGG 


1527 


7 




CHGG 


1494 


10 




CAAGG 


1512 


8 


TTTTA 


1468 


6 




HGGG 


1534 


7 




CCGGG 


1504 


10 




AAAAG 


1520 


8 


CAATA 


1471 


3 




AAAGG 


1536 


2 




TTTGG 


1509 


5 




ACGGG 


1528 


8 


riTAA 


1477 


6 




ATGGG 


1543 


7 




CTGGG 


1519 


10 




AAAGG 


1536 


8 


CAAAA 


1480 


3 




AAGGG 


1552 


9 




HGGG 


1534 


15 




CGGGG 


1544 


8 


HAAA 


1486 


6 




TGGGG 


1559 


7 




CGGGG 


1544 


10 




AAGGG 


1552 


8 


TAAAA 


14 

95 


9 




AGGGG 


1568 


9 




TGGGG 


1559 


15 




AGGGG 


1568 


16 




1 9 




GGGGG 


1584 


16 




GGGGG 


1584 


25 




GGGGG 


1584 


16 
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Cleavage at G 


Cleavage ate 




6iiier 


mass 


mass A 




6mer 


mass 


mass A 


www 


1673 






llllll 


1763 




CCCCCT 

www 1 


1688 


15 




TTTTTA 


1772 


9 


CCCCCA 

W WW w wxl 


1697 


9 




TTTTAA 


1781 


9 


CCCCTT 


1703 


6 




TTTTTG 


1788 


7 


CCCCTA 

WW WW 1 


1712 


9 




TTTAAA 


1790 


2 


CCCTTT 


1718 


6 




TTTTAG 


1797 


7 




1721 


3 




TTAAAA 


1799 


2 


CCCTTA 


1727 


6 




TTTAAG 


1806 


7 


CC\ 1 1 1 

vwl 1 1 1 


1733 


5 




TAAAAA 


1808 


2 


CCCTAA 

www 1 nn 


1736 


3 




TTTTGG 


1813 


5 


CCTTTA 


1742 


6 




TTAAAG 


1815 


2 


CCCAAA 


1745 


3 




AAAAAA 


1817 


2 


TTTTTC 


1748 


3 




TTTGGA 


1822 


5 


CCTTAA 

WW 1 1 


1751 


3 




AAAAGT 


1824 


2 


CTTTTA 


1757 


6 




TTAAGG 


1831 


7 


CCAAAT 


1760 


3 




AAAAAG 


1833 


2 


III III 


1763 


3 




nTGGG 


1838 


5 


CTTTAA 


1766 


3 




AAAGGT 


1840 


2 


CCAAAA 


1769 


3 




ATTGGG 


1847 


7 


TTTTTA 


1772 


3 




AAAAGG 


1849 


2 


CTTAAA 


1775 


3 




TAAGGG 


1856 


7 


TTTTAA 


1781 


6 




HGGGG 


1863 


7 


TAAAAC 


1784 


3 




AAAGGG 


1865 


2 


TTTAAA 


1790 


6 




AGGGGT 


1872 


7 


CAAAAA 


1793 


3 




AAGGGG 


1881 


9 


TTAAAA 


1799 


6 




GGGGGT 


1888 


7 


TAAAAA 


1808 


9 




AGGGGG 


1897 


9 


AAAAAA 


1817 


9 




GGGGGG 


1913 


16 














7mer 


mass 


mass Δ 




7mer 


mass 


mass Δ 


ccccccc 

Www WW WW 


1962 






TTTTTTI 


2067 




crcccci 


1977 


15 




TTTTTTA 


2076 


9 




1986 


9 




TTTTTAA 


2085 


9 




1992 


6 




TTTTTTG 


2092 


7 




£MM 1 


9 




TTTTAAA 

till rWi 


2094 


2 




9007 


6 




TTTTTGA 


2101 


7 




^10 


3 




TTTAAAA 

III nrvm 


2103 


2 




901 


6 




TTTTAAG 


2110 


7 


rm III 


9079 


6 




TTAAAAA 


2112 


2 


Ul/UU 1 MM 


909'i 


3 




GGTTTTT 


2117 


5 


CCCTTTA 

w W 1 1 1 n 


2031 


6 




TTTAAAG 


2119 


2 


CCCCAAA 


2034 


3 




TAAAAAA 


2121 


2 


PPT 1 1 1 1 

1 1 1 1 1 


9037 


3 




TTTTGGA 


2126 


5 


www 1 1 r\r\ 


2040 


3 




TTAAAGA 


2128 


2 


PriTTTA 

1 1 1 In 


2046 


6 




AAAAAAA 


2130 


2 




9nAQ 


3 




TTTGGAA 

III WW/V\ 


2135 


5 


Ul 1 1 1 1 1 


90')9 


3 




AAAAAGT 

/www 1 


2137 


2 


Ul/ 1 1 i MM 


9A'\<t 


3 




GGGTTTT 

www 1 1 1 1 


2142 


5 




9A'U{ 


3 




TTAAAGG 

i IMMMww 


2144 


2 


TTTTTPA 
1 H 1 1 U A 


ZUOI 


3 




AAAAAAn 
MMMMMMw 


91ig 


2 


/*/*TTA AA 
UUI lAAA 


ZU04 


3 






2151 


5 


1 1 i 1 1 i 1 


OAC7 


3 




AAAAftftT 


91*53 


2 


TTTTA A/* 


ZU/U 


3 




MM 1 1 Ul3w 


91f)0 


7 


TA AAAPP 

lAAAAv/v 


9071 


3 




AAAAAGG 


9169 


2 


Al 1 1 1 1 1 


on7A 

ZUf 0 


3 




ftRftGTTT 

1 1 1 


2167 


5 


TTTA A Al^ 

TI lAAAU 




3 




TAAAf^r^G 




2 


CCAAAAA 


2082 


3 




TTGGGGA 


2176 


7 


AATTTTT 


2085 


3 




AAAAGGG 


2178 


2 


CTTAAAA 


2088 


3 




AAGGGGT 


2185 


7 


AAATTTT 


2094 


6 




GGGGGH 


2192 


7 


CTAAAAA 


2097 


3 




AAAGGGG 


2194 


2 


AAAATH 


2103 


6 




AGGGGGT 


2201 


7 


CAAAAAA 


2106 


3 




AAGGGGG 


2210 


9 


AAAAAH 


2112 


6 




GGGGGGT 


2217 


7 


AAAAAAT 


2121 


9 




AGGGGGG 


2226 


9 


AAAAAA 


2130 


9 




GGGGGG 


2242 


16 



Cleavage at A Cleavage at T 



6mer 


mass 


mass Δ 




6mer 


mass 


mass Δ 


CCCCCC 


1673 






CCCCCC 


1673 




CCCCCT 


1688 


15 




CCCCCA 


1697 


24 


ccccn 


1703 


15 




CCCCCG 


1713 


16 


CCCCCG 


1713 


10 




CCCCAA 


1721 


8 


CCCTTT 


1718 


5 




CCCCAG 


1737 


16 


CCCCTG 


1728 


10 




CCCAAA 


1745 


8 


CCTTTT 


1733 


5 




CCCCGG 


1753 


8 


CCCHG 


1743 


10 




CCCAAG 


1761 


8 


TTTTTC 


1748 


5 




CCAAAA 


1769 


8 


CCCCGG 


1753 


5 




CCCGGA 


1777 


8 


CCTHG 


1758 


5 




CCAAAG 


1785 


8 


llllll 


1763 


5 




CCCGGG 


1793 


8 


CCCTGG 


1768 


5 




CAAAAA 


1793 


0 


rrrrcG 


1773 


5 




CCAAGG 


1801 


8 


CCHGG 


1783 


10 




CAAAAG 


1809 


8 


rmTG 


1788 


5 




CCGGGA 


1817 


8 


CCCGGG 


1793 


5 




AAAAAA 


1817 


0 


TTTCGG 


1798 


5 




AAACGG 


1825 


8 


CCTGGG 


1808 


10 




AAAAAG 


1833 


8 


rrrTGG 


1813 


5 




CCGGGG 


1833 


0 


nCGGG 


1823 


10 




AACGGG 


1841 


8 


CCGGGG 


1833 


10 




AAAAGG 


1849 


8 


TTTGGG 


1838 


5 




ACGGGG 


1857 


8 


TGGGGC 


1848 


10 




AAAGGG 


1865 


8 


HGGGG 


1863 


15 




GGGGGC 


1873 


8 


GGGGGC 


1873 


10 




AAGGGG 


1881 


8 


GGGGGT 


1888 


15 




AGGGGG 


1897 


16 


GGGGGG 


1913 


25 




GGGGGG 


1913 


16 



7mer 


mass 


mass Δ 




7mer 


mass 


mass Δ 


CCCCCCC 


1962 






CCCCCCC 


1962 




CCCCCCT 


1977 


15 




CCCCCCA 


1986 


24 


cccccn 


1992 


15 




CCCCCCG 


2002 


16 


CCCCCCG 


2002 


10 




CCCCCAA 


2010 


8 


CCCCTH 


2007 


5 




CCCCCGA 


2026 


16 


CCCCCTG 


2017 


10 




CCCCAAA 


2034 


8 


cccmrr 


2022 


5 




CCCCCGG 


2042 


8 


CCCCHG 


2032 


10 




CCCCAAG 


2050 


8 


CCIIIll 


2037 


5 




CCCAAAA 


2058 


8 


CCCCCGG 


2042 


5 




CCCCGGA 


2066 


8 


CCCTTTG 


2047 


5 




CCCAAAG 


2074 


8 


CTTTTTT 


2052 


5 




CCAAAAA 


2082 


8 


CCCCTGG 


2057 


5 




CCCCGGG 


2082 


0 


CCTTTTG 


2062 


5 




CCCGGAA 


2090 


8 


1111III 


2067 


5 




CCAAAAG 


2098 


8 


CCCHGG 


2072 


5 




CCCGGGA 


2106 


8 


CTTTTTG 


2077 


5 




CAAAAAA 


2106 


0 


CCCCGGG 


2082 


5 




CCAAAGG 


2114 


8 


CTTTCGG 


2087 


5 




CAAAAAG 


2122 


8 


Glliltr 


2092 


5 




CCCGGGG 


2122 


0 


CCCTGGG 


2097 


5 




CCGGGAA 


2130 


8 


cmrTGG 


2102 


5 




AAAAAAA 


2130 


0 


CCHGGG 


2112 


10 




AAAACGG 


2138 


8 


GGIIIII 


2117 


5 




AAAAAAG 


2146 


8 


CCCGGGG 


2122 


5 




CCGGGGA 


2146 


0 


crrrGGG 


2127 


5 




AAACGGG 


2154 


8 


TGGGGCC 


2137 


10 




AAAAAGG 


2162 


8 


GGGTTn 


2142 


5 




CCGGGGG 


2162 


0 


CHGGGG 


2152 


10 




AACGGGG 


2170 


8 


GGGGGCC 


2162 


10 




AAAAGGG 


2178 


8 


GGGGTH 


2167 


5 




AGGGGGC 


2186 


8 


GGGGGTC 


2177 


10 




AAAGGGG 


2194 


8 


GGGGGH 


2192 


15 




CGGGGGG 


2202 


8 


CGGGGGG 


2202 


10 




AAGGGGG 


2210 


8 


GGGGGGT 


2217 


15 




AGGGGGG 


2226 


16 


GGGGGGG 


2242 


25 




GGGGGGG 


2242 


16 
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TABLE 4 





          Polynucleotides    Masses

Set 1     d(C2G3)            1566.016
          d(A5)              1566.068

Set 2     d(CA)              2433.584
          d(T8)              2433.603
          d(C3A5)            2433.636

Set 3     d (AiG,)           2617.707
          d(C,T,)            2617.711

Set 4     d (CM              3196.090
          d(G,n)             3196.137

Set 5     d (C J A)          3292.134
          d(C„)              3292.190

Set 6     d(C13)             3759.45 /
          drr,A.a)           3759.472

Set 7     d(C5T,)            4183.751
          d(A,G,)            4183.779

Set 8     d(TA)              4433.899
          d(C„A.)            4433.936

Set 2 involves a polynucleotide with multiple G residues. Thus, cleavage at mod G 
would eliminate all isobaric masses except one, d(T8) vs. d(C3A5), which could not be 
resolved by mass spectrometry with a resolution of 0.01%. However, cleavage at either 
a mod C or a mod A would resolve the matter. 

[0114] For a polynucleotide of known sequence, one can easily predict whether 
cleavage at a particular nucleotide would produce any of the above confounding 
artifacts and then choose experimental conditions that avoid, reduce or resolve them. 
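
By way of illustration only, the prediction of unresolvable (isobaric) fragments can be sketched as follows. The residue masses are approximate standard average values and are an assumption, not figures taken from this disclosure; the fragment compositions and the function names are likewise hypothetical.

```python
from itertools import combinations

# Approximate average masses (Da) of nucleotide residues within a DNA chain.
# These are assumed textbook values, not values quoted in the text.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}

def composition_mass(composition):
    """Mass of a fragment given as a dict such as {"C": 2, "G": 3}."""
    return sum(RESIDUE_MASS[base] * count for base, count in composition.items())

def unresolved_pairs(compositions, resolution=1e-4):
    """Return pairs of fragments whose masses differ by less than
    `resolution` (0.01% = 1e-4) of their mean mass."""
    clashes = []
    for (name_a, comp_a), (name_b, comp_b) in combinations(compositions.items(), 2):
        mass_a, mass_b = composition_mass(comp_a), composition_mass(comp_b)
        mean = (mass_a + mass_b) / 2
        if abs(mass_a - mass_b) / mean < resolution:
            clashes.append((name_a, name_b))
    return clashes

# Hypothetical fragment compositions expected from one cleavage scheme.
fragments = {
    "d(A5)": {"A": 5},
    "d(C2G3)": {"C": 2, "G": 3},
    "d(T5)": {"T": 5},
}
print(unresolved_pairs(fragments))
```

If a pair is reported, a different cleavage nucleotide can be tried in the same way until no unresolved pair remains.
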
[0115] Table 5 shows the sets of mass changes expected on complementary 
strands for all possible point mutations (transitions and transversions). Whether a 
particular variance is an addition of a nucleotide (approximately 300+ a.u. increase in 
fragment mass), a deletion of a nucleotide (approximately a 300+ a.u. decrease in 
fragment mass) or a substitution of one nucleotide for another can easily be 
ascertained. Furthermore, if the variance is a substitution, the exact nature of that 
substitution can also be determined. 
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Table 5 







                        Resolving Power of MS Instrument (FWHM)
Nucleotide
substitution   Δ (Da)   1,000      1,500      2,000      10,000
                        Maximum fragment in which Δ at left is resolvable

C<->G          40       123 nt     184 nt     246 nt     1,230 nt
G<->T          25       77 nt      116 nt     154 nt     770 nt
A<->C          24       74 nt      111 nt     148 nt     740 nt
A<->G          16       49 nt      74 nt      98 nt      490 nt
C<->T          15       46 nt      69 nt      92 nt      460 nt
A<->T          9        27 nt      41 nt      55 nt      270 nt



[0116] Table 5 also summarizes the relation between mass spectrometer 
resolution and nucleotide changes in determining the maximum size fragment in which 
a given base change can be identified. The maximum size DNA fragment in which a 
base substitution can theoretically be resolved is provided in the four columns on the 
right for each possible nucleotide substitution. The mass difference created by each 
substitution (Δ, measured in Daltons, Da) and the resolving power of the mass 
spectrometer determine the size limit of fragments that can be successfully analyzed. 
Presently available commercial MALDI instruments can resolve between 1 part in 1,000 
and 1 part in 5,000, while available ESI instruments can resolve 1 part in 10,000. 
Modified ESI instruments are capable of at least 10-fold greater mass resolution. 
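
The relation summarized in Table 5 may be approximated, purely as a sketch, by multiplying the mass difference by the resolving power and dividing by an average mass per nucleotide. The figure of 325 Da per nucleotide used below is an assumption made for illustration; it is not stated in the text, and the values obtained agree with Table 5 only approximately.

```python
# Mass differences (Da) for each substitution, as listed in Table 5.
DELTA_DA = {"C<->G": 40, "G<->T": 25, "A<->C": 24,
            "A<->G": 16, "C<->T": 15, "A<->T": 9}

# Assumed average mass of one nucleotide in a fragment (Da); illustrative only.
AVG_NT_MASS_DA = 325.0

def max_resolvable_nt(delta_da, resolving_power, avg_nt_mass=AVG_NT_MASS_DA):
    """Largest fragment (in nucleotides) in which a mass shift of delta_da
    is still resolvable at the given resolving power (m / delta m)."""
    max_mass = delta_da * resolving_power      # heaviest fragment still resolved
    return int(max_mass // avg_nt_mass)

for name, delta in DELTA_DA.items():
    row = [max_resolvable_nt(delta, r) for r in (1000, 1500, 2000, 10000)]
    print(f"{name}: {row}")
```
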
e. Cleavage resistant modified nucleotides 

[0117] The preceding embodiments of this invention relate primarily to the 
substitution into a polynucleotide of one or more modified nucleotides, which enhance 
the susceptibility of the polynucleotide to cleavage at the site(s) of incorporation. It is 
also an aspect of this invention to incorporate a combination of cleavage-resistant and 
cleavage-sensitive modified nucleotides into a polynucleotide to further enhance 
selectivity. An example of a modified nucleotide which imparts cleavage resistance is 
the 2'-fluoro derivative, which has been shown to be substantially less susceptible to 
fragmentation in a mass spectrometer than the corresponding unsubstituted natural 
nucleotide. 
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APPLICATIONS 

[0118] A number of applications of the methods of the present invention are 
described below. It is understood that the following are exemplary only and are not 
intended, nor should they be construed, to limit the scope of this invention in any 
manner whatsoever. Other applications of the methods described herein will become 
apparent to those skilled in the art based on the disclosures herein. Such applications 
are within the scope of this invention. 
a. Variance detection 

[0119] In one aspect of the present invention at least one natural nucleotide is 
replaced at substantially each point of occurrence in a polynucleotide with a 
base-modified nucleotide. This is accomplished by either primer extension, if one strand is 
being used, or amplification if two strands are being used. The resultant modified 
polynucleotide is treated with a reagent or combination of reagents that cleaves it at 
substantially each base-modified nucleotide. Under this protocol, if the abundance of 
A, C, G and T were equal in naturally-occurring polynucleotides and if their distribution 
were entirely random, then the fragments obtained would average 4 nucleotides. In 
actuality, there is considerable deviation in the size of fragments due to the non-random 
distribution of nucleotides in biological polynucleotides and the unequal amounts of A:T 
vs. G:C base pairs in different genomes. Furthermore, the modified polynucleotide will 
not be cleaved until the first occurrence of a modified nucleotide after the end of the 
primer. Thus, one fragment (if single strand primer extension is used, two if 
amplification is used) will contain all the primer nucleotides plus those of the modified 
polynucleotide up to the point of incorporation of the first modified nucleotide. Often, 
these primer-containing fragments will be the largest produced. This can be 
advantageous in the design of genotyping assays. That is, primers can be designed so 
that a suspected polymorphic locus is the first occurrence of a modified nucleotide 
corresponding to one of a pair of SNP nucleotides after the end of the primer. Thus, 
only the primer-containing fragment must be analyzed to determine the genotype. 
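
As a non-limiting sketch of the fragmentation pattern just described, the following assumes that cleavage splits the strand immediately before each modified nucleotide and that the primer, being composed of natural nucleotides, is never cut. The sequence, the primer length and the function name are hypothetical.

```python
def cleave(sequence, base, primer_length=0):
    """Split `sequence` (5'->3') immediately before every occurrence of `base`
    located after the primer. Primer nucleotides are natural and are never cut."""
    fragments, start = [], 0
    for position in range(primer_length, len(sequence)):
        if sequence[position] == base and position > start:
            fragments.append(sequence[start:position])
            start = position
    fragments.append(sequence[start:])
    return fragments

# Hypothetical 8-nt primer followed by an extension product (illustrative only).
product = "GGCCTTGG" + "CCTACGTTAGCA"
fragments = cleave(product, "A", primer_length=8)
print(fragments)   # the first element is the primer-containing fragment
```

In this example the primer-containing fragment is the longest one, which is the situation the paragraph above describes as common.
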
[0120] Due to the variation in length of fragments that will be created on 
cleavage, a mass spectrometer must be capable of detecting the masses of 
oligonucleotides up to 20mers or even 30mers. To match the expected fragment sizes 
to the capabilities of the mass spectrometer being used, it is desirable to select an 
optimal modified nucleotide substitution/cleavage scheme for each polynucleotide 
sequence that is to be analyzed. One method for accomplishing this is the following: 
[0121] (a) For each nucleotide at each position in a test polynucleotide, 
substitute each of the other three nucleotides. For example, if position 1 of the test 
polynucleotide is an A, hypothetical polynucleotides having T, G and C at position 1 are 
generated. The same is done for each nucleotide in the polynucleotide. Thus, if the 
test polynucleotide is 100 nucleotides in length, 300 new hypothetical polynucleotides 
will be generated if only one strand is being used. If two strands are involved, then 
another 300 polynucleotides will be generated from the complementary strand. 
[0122] (b) Generate the masses that would be produced by cleaving at A in 
the original (reference) polynucleotide and at T, C, G in each of the three new 
hypothetical polynucleotides obtained by the substitutions of T, C or G for A at position 
1. For each of the four cleavages (T, C, G, A), determine whether the disappearance of 
an existing mass or the generation of a new mass would create a difference in the total 
set of masses. If a difference is created, determine whether it is a single difference or 
two differences (i.e., a disappearance of one mass and an appearance of another). 
Also, determine the magnitude of the mass difference compared to that of the set of 
masses generated by cleavage of the reference sequence. Perform this same analysis 
for each of the 100 nucleotides of the original polynucleotide. 

[0123] (c) Generate a correlation score for each of the four base-specific 
cleavages. The correlation score increases in proportion to the fraction of the 300 
deviations from the reference sequence that produces one or more mass changes (with 
a higher correlation score being given for two mass differences). The correlation score 
will also be proportional to the size of the mass differences (larger mass differences 
score higher). 

[0124] In the case of primer extension, the analysis is performed for one strand; 
in the case of amplification, it is carried out for both strands. The method can be 
extended to combinations of substitution and cleavage. For example, T cleavage on 
each of the strands of a polynucleotide or cleavage at T and A on one strand (in either 
case the cleavage may be carried out independently or simultaneously on the two 
strands) or cleavage of one strand at T and the other at A. Based on the correlation 
scores for each of the different approaches, an optimal substitution/cleavage scheme 
for the available instrument can be determined in advance of experimental work. 
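
Steps (a) through (c) can be sketched, in simplified form, as follows. This sketch scores each base-specific cleavage only by the fraction of single substitutions that produce at least one new or missing peak; the additional weighting for two mass differences and for the size of the difference is omitted. The nominal residue masses, the rule that cleavage occurs immediately before the modified nucleotide, the resolving power and all function names are assumptions made for illustration.

```python
# Nominal residue masses (Da); assumed values used only for illustration.
MASS = {"A": 313, "C": 289, "G": 329, "T": 304}

def peak_masses(sequence, base):
    """Set of fragment masses after cutting before each occurrence of `base`."""
    cuts = [0] + [i for i in range(1, len(sequence)) if sequence[i] == base]
    cuts.append(len(sequence))
    return {sum(MASS[b] for b in sequence[s:e]) for s, e in zip(cuts, cuts[1:])}

def unmatched(peaks, others, resolving_power):
    """Peaks with no partner in `others` within mass / resolving_power."""
    return [m for m in peaks
            if all(abs(m - o) > m / resolving_power for o in others)]

def detection_fraction(reference, base, resolving_power=1000):
    """Fraction of all single substitutions that add or remove a peak."""
    ref = peak_masses(reference, base)
    detected = total = 0
    for i, original in enumerate(reference):
        for b in "ACGT":
            if b == original:
                continue
            variant = reference[:i] + b + reference[i + 1:]   # step (a)
            var = peak_masses(variant, base)                  # step (b)
            new = unmatched(var, ref, resolving_power)
            missing = unmatched(ref, var, resolving_power)
            detected += bool(new or missing)
            total += 1
    return detected / total                                   # step (c)

reference = "ACGTTGCAAGTCCATG"   # hypothetical test polynucleotide
scores = {base: detection_fraction(reference, base) for base in "ACGT"}
print(scores)
```

The cleavage base with the highest score would be the preferred choice for the instrument being modeled; the same loop can be repeated on the complementary strand when amplification is used.
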
[0125] The above procedure is readily computerized. Furthermore, the program 
set up to determine the best experimental protocol can also be programmed to perform 
the comparison of experimental cleavage masses obtained with the hypothetical 
results, which constitute all possible cleavage masses. That is, the program can be 
constructed to compare all the masses in the experimentally determined mass 
spectrum with the cleavage masses expected from cleavage of the reference sequence 
and to flag any new or missing masses. If there are new or missing masses, the 
experimental set of masses can be compared with the masses generated in the 
computational analysis of all the possible nucleotide substitutions, insertions or 
deletions associated with the experimental cleavage conditions. However, nucleotide 
substitutions are about ten times more common than insertions or deletions, so an 
analysis of substitutions alone might suffice. The computational analysis data for all 
possible nucleotide insertions, deletions and substitutions can be stored in a look-up 
table. The set of computational masses that matches the experimental data then 
provides the sequence of the new variant sequence or, at a minimum, the restricted set 
of possible sequences of the new variant sequence. (The location and chemical nature 
of a substitution may not be uniquely specified by one cleavage experiment.) To 
resolve all ambiguity concerning the nucleotide sequence of a variant sample may 
require, in some cases, another substitution and cleavage experiment or may be 
resolved by some other sequencing method (e.g., conventional sequencing methods or 
sequencing by hybridization). It may be advantageous to routinely perform multiple 
different substitution and cleavage experiments on all samples to maximize the fraction 
of variances that can then be precisely assigned. 
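
One possible form of the comparison program described above is sketched below. The mass values, the 1 Da matching tolerance, the variant names and the function names are all hypothetical and serve only to show how new or missing masses may be flagged and then matched against a look-up table.

```python
def compare_spectra(expected, observed, tolerance=1.0):
    """Return (missing, new): expected masses absent from the observed spectrum
    and observed masses that match no expected mass, within `tolerance` Da."""
    missing = [m for m in expected
               if not any(abs(m - o) <= tolerance for o in observed)]
    new = [o for o in observed
           if not any(abs(o - m) <= tolerance for m in expected)]
    return missing, new

def matching_variants(lookup, observed, tolerance=1.0):
    """Names of precomputed variants whose mass set explains the spectrum."""
    return [name for name, masses in lookup.items()
            if compare_spectra(masses, observed, tolerance) == ([], [])]

# Hypothetical masses (Da) for illustration only.
reference = [1408.0, 1721.0, 2034.0]
lookup = {
    "reference": reference,
    "variant 1": [1408.0, 1737.0, 2034.0],
    "variant 2": [1408.0, 1721.0, 2043.0],
}
observed = [1408.2, 1736.8, 2034.1]

print(compare_spectra(reference, observed))   # flags one missing and one new mass
print(matching_variants(lookup, observed))
```

When more than one entry of the look-up table matches, the returned list is the restricted set of candidate sequences mentioned above, and a second cleavage experiment can be used to narrow it.
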

[0126] A computational analysis of natural polynucleotides of 50, 100, 150, 200 
and 250 nucleotides has revealed that combinations of two nucleotide cleavages (for 
example, cleave at A on one strand and G on the complementary strand) result in 
99-100% detection efficiency, considering all possible substitutions up to 250 nt. Useful 
data might even be obtainable from fragments up to 1000 nt, although the detection 
efficiency would in most cases be less than 100%. 
b. Genotyping 

[0127] As DNA sequence data accumulates, more and more variances in the 
genetic code for individuals compared to the general population within a species are 
being recognized. Some of these variances are being related to phenotypic 
differences, such as an increased susceptibility to a particular disease or a different 
reaction to a given therapeutic regime. Thus, there is increasing demand for accurate, 
high throughput, automatable and inexpensive methods for determining the status of a 
specific nucleotide or nucleotides in a gene in which a variance between individuals has 
been discovered. This procedure - the determination of the nucleotide at a particular 
location in a DNA sequence - is referred to as genotyping. The methods of this 
invention are well suited to genotyping. First, a segment of DNA in which a variance is 
known to occur in some individuals is replicated to produce enough of the segment to 
work with. This can be accomplished by primer extension or by amplification. 
Amplification by PCR is presently preferred. The amplification is performed in the 
presence of three natural nucleotides and one base-modified nucleotide. The 
base-modified nucleotide can correspond to one of the nucleotides giving rise to the variance 
or it can correspond to a nucleotide that flanks the variable position. The latter 
approach can in some cases be advantageous because the primer sequence is then 
never a part of the fragment that contains the polymorphism. This has the advantage of 
producing low molecular weight fragments, which, in turn, results in more efficient 
desorption in MALDI mass spec and a larger signal than those obtained from larger 
fragments. In addition, the higher peaks might allow for enhanced automated calling of 
variances. However, depending on the length of the fragments predicted by using a 
flanking nucleotide, it might be more advantageous to use a modified nucleotide that 
corresponds to one of the nucleotides at the polymorphic site. For example, if an A/T 
polymorphism is to be genotyped, the cleavable nucleotide could be either A or T. If a 
G/A polymorphism is to be genotyped, the cleavable nucleotide could be either A or G. 
Conversely, the assay could be set up for the complementary strand, where T and C 
occur opposite A and G. The polymerization product is chemically cleaved by treatment 
with acid, base or other reagent. If the alleles being studied are heterozygous, two 
products will be obtained with one being longer than the other as a result of the 
presence of the cleavable nucleotide at the polymorphic site in one allele but not the 
other. A mass change, but not a length change, also occurs on the opposite strand. 
One constraint is that one of the primers used for producing the polynucleotide must be 
located such that the first occurrence of the cleavable nucleotide after the end of the 
primer is at the polymorphic site. This usually requires one of the primers to be close to 
the polymorphic site. An alternative method is to simultaneously incorporate two 
cleavable nucleotides, one for a polymorphic nucleotide on the (+) strand, one for a 
polymorphic nucleotide on the (-) strand. For example, one might incorporate cleavable 
dA on the (+) strand (to detect an A-G polymorphism) and cleavable dC on the (-) 
strand (to positively detect the presence of the G allele on the (+) strand). In this case, 
it may be advantageous to have both primers close to the variant site. The two allelic 
products of different size can be analyzed by, without limitation, electrophoresis, mass 
spectrometry or FRET analysis. Any of these three assays is compatible with 
multiplexing by means known in the art. 
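
The length difference between the two allelic products can be illustrated with the following sketch. It assumes a hypothetical 8-nt primer, an A/G polymorphism three nucleotides beyond the primer, and cleavage immediately before the first cleavable A after the primer; none of these sequences or names is taken from the examples of this disclosure.

```python
def primer_fragment_length(amplicon, primer_length, cleavable_base):
    """Length of the primer-containing fragment: the primer plus every
    nucleotide up to (not including) the first cleavable base after it."""
    for position in range(primer_length, len(amplicon)):
        if amplicon[position] == cleavable_base:
            return position
    return len(amplicon)          # no cleavable base: the strand stays intact

def call_genotype(fragment_lengths, short_len, long_len):
    """Map the set of observed primer-fragment lengths to a genotype label."""
    observed = set(fragment_lengths)
    if observed == {short_len}:
        return "homozygous cleavable allele"
    if observed == {long_len}:
        return "homozygous non-cleavable allele"
    if observed == {short_len, long_len}:
        return "heterozygous"
    return "no call"

# Hypothetical A/G polymorphism placed shortly after an 8-nt primer.
primer = "GGCCTTGG"
allele_a = primer + "CCT" + "A" + "CGTTCAGC"
allele_g = primer + "CCT" + "G" + "CGTTCAGC"
len_a = primer_fragment_length(allele_a, len(primer), "A")
len_g = primer_fragment_length(allele_g, len(primer), "A")
print(len_a, len_g)
print(call_genotype([len_a, len_g], len_a, len_g))
```

The two lengths differ because, in the G allele, the first cleavable A lies further downstream; either length can be read out by electrophoresis or converted to a mass for mass spectrometry.
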
[0128] FRET analysis is particularly useful, especially in light of the previously 
described phosphine/secondary amine cleavage that can result in the appendage of a 
FRET dye to a fragment during the cleavage reaction. One way to perform FRET 
detection is to introduce a probe with a fluorophore or a quencher that hybridizes 
differentially to the cleaved strand (representing one allele) compared to the 
non-cleaved strand (representing the other allele). Such differential hybridization is readily 
achievable because one strand is longer than the other by at least one, and often 
several, nucleotides. A fluorophore or quenching group is placed on the primer used to 
produce the cleavable polynucleotide such that a FRET relationship is established 
between the moiety on the probe and the moiety on the primer. That is, the absorbing 
and emitting wavelengths of the two moieties are matched and the distance and 
orientation of the two moieties are correct. A signal will be detected for one allele but not 
the other when the probe and primer are heated to a temperature that causes 
denaturation of the shorter allele-probe hybridization product. For example, the primer 
could be hybridized to the region that is removed by cleavage in one allele but is 
present in the other allele. When selecting primers for PCR or primer extension, one 
consideration might be to locate the primer so as to maximize the length difference 
between the two alleles. Other means of maximizing the discrimination would include 
the use of a "molecular beacon" strategy where the ends of the probe are 
complementary, and form a stem, except in the presence of the non-cleaved allele 
where the non-cleaved segment is complementary to the stem of the probe and 
therefore effectively competes with the formation of intramolecular stems in the probe 
molecule. 

[0129] Another way to produce a FRET signal that discriminates between two 
variant alleles is to incorporate a nucleotide containing a dye that interacts with a dye 
on the primer. The dye-modified nucleotide is selected such that it is incorporated 
beyond the primer and the polymorphic site. After cleavage, the nucleotide dye of one 
allele (cleaved) will no longer be within resonance producing distance of the primer dye 
while, in the other (uncleaved) allele, the proper distance will be maintained and 
FRET will occur. A disadvantage of this method is that it requires a purification step to 
remove unincorporated dye molecules that can produce a background signal which 
might interfere with FRET detection. A non-limiting example of the experimental steps 
involved in carrying out this method is: (1) PCR with dye-labeled primer and either a 
cleavable modified nucleotide also carrying a dye or one cleavable modified nucleotide 
and one dye-labeled nucleotide (the dye can be on the cleavable nucleotide if the 
cleavage mechanism results in separation of the dye from the primer, for instance, in 
the case of 5'-amino substitution, which results in cleavage proximal to the sugar and 
base of the nucleotide); (2) cleavage at the cleavable modified nucleotide; (3) 
purification to remove free nucleotides; and (4) FRET detection. 
[0130] Another example, a genotyping assay, would begin with PCR using one 
modified nucleotide along with three natural nucleotides. The PCR primers would be 
designed such that the polymorphic base is near one of the primers and there is no 
cleavable base between the primer and the polymorphic base. If the cleavable base is 
one of the polymorphic bases, the primer-containing cleavage product from that allele 
will be shorter than the product from the other allele. 
[0131] Any technique that permits determination of the mass of relatively large 
molecules without causing non-specific disintegration of the molecules in the process 
may be used with the methods of this invention. A presently preferred technique is 
MALDI mass spectroscopy, which is well suited to the analysis of complex mixtures. 
Commercial MALDI instruments are available which are capable of measuring mass 
with an accuracy on the order of 0.05 to 0.1%. That is, these instruments are capable 
of resolving molecules differing in molecular weight by as little as one part in two 
thousand under optimal conditions. Advances in MALDI MS technology will likely 
increase the obtainable resolution in the future, thus increasing the utility of this 
invention. The smallest difference that can occur between two variant strands is an A-T 
transversion, a molecular weight difference of 9 (Table 5). A MALDI mass spec having 
a resolution of 2,000 (that is, a machine capable of distinguishing an ion with an m/z 
(mass/charge) of 2,000 from an ion with an m/z of 2,001) would be able to detect an 
A-T transversion in an approximately 18,000 Dalton sequence. A 'Dalton' is a unit of 
molecular weight used when describing the size of large molecules; for all intents and 
purposes it is equivalent to molecular weight. In actual use, the practical resolving 
power of an instrument may be limited by the isotopic heterogeneity of carbon (i.e., 
carbon exists in nature as Carbon-12 and Carbon-13) and by other factors. Assuming an 
approximately even distribution of the four nucleotides in a DNA fragment, this 
translates to detection of an A-T transversion in an oligonucleotide containing about 55 
nucleotides. At the other end of the spectrum, a single C-G transversion, which results 
in a molecular weight difference of 40, could be detected in a 246 nt oligonucleotide by 
MALDI mass spec. 
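
The arithmetic of the preceding paragraph can also be run in the opposite direction, to estimate the resolving power an instrument would need for a fragment of a given length. The sketch below assumes an average mass of 325 Da per nucleotide; that value is an assumption chosen for illustration and is not stated in the text.

```python
import math

AVG_NT_MASS_DA = 325.0   # assumed average mass per nucleotide (illustrative)

def required_resolving_power(fragment_nt, delta_da, avg_nt_mass=AVG_NT_MASS_DA):
    """Smallest resolving power (m / delta m) at which a mass shift of
    `delta_da` is distinguishable in a fragment of `fragment_nt` nucleotides."""
    return math.ceil(fragment_nt * avg_nt_mass / delta_da)

print(required_resolving_power(55, 9))     # A-T transversion in a 55-mer
print(required_resolving_power(246, 40))   # C-G transversion in a 246-mer
```

Both results fall just under 2,000, which is consistent with the fragment sizes quoted above for an instrument of that resolution.
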

[0132] The size of an oligonucleotide in which an A-T transversion is detectable 
could be increased by substituting a heavier non-natural nucleotide for either the A or 
the T, for example by replacing A with 7-methyl-A, thus increasing the molecular weight 
change to 23. Another possibility would be to substitute 2-chloroadenine, which has a 
mass of 364.5, for A. It has been shown that 2-chloroadenine is readily incorporated into 
polynucleotides by DNA polymerase from Thermus aquaticus (Hentosh, P., Anal. 
Biochem., 1992, 201: 277-281). As shown in Table 1, this has a favorable effect on 
mass differences between all the nucleotides and A. Most importantly, it changes the 
T-A difference from 9 Da to 42.3 Da. 
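
The effect of the heavier analog on the mass differences can be checked with a short calculation. The mass of 364.5 for the 2-chloroadenine nucleotide is taken from the text; the masses of the four natural nucleotides are assumed approximate values for the 2'-deoxynucleoside monophosphates, chosen to be consistent with the 9 Da A/T difference.

```python
# Approximate masses (Da) of the natural 2'-deoxynucleoside monophosphates.
# These are assumed standard values, not values quoted in the text.
NUCLEOTIDE_MASS = {"A": 331.2, "C": 307.2, "G": 347.2, "T": 322.2}
MODIFIED_A_MASS = 364.5   # 2-chloroadenine nucleotide, as given in the text

def differences_from_a(mass_of_a):
    """Absolute mass difference (Da) between A (or its analog) and C, G, T."""
    return {base: round(abs(mass_of_a - mass), 1)
            for base, mass in NUCLEOTIDE_MASS.items() if base != "A"}

natural = differences_from_a(NUCLEOTIDE_MASS["A"])
modified = differences_from_a(MODIFIED_A_MASS)
print(natural)    # A compared with C, G and T using natural dA
print(modified)   # the same comparison using the 2-chloroadenine analog
```

Because the resolvable fragment size scales with the mass difference, raising the A/T difference from 9 Da to 42.3 Da would extend the detectable fragment length for that substitution by a factor of roughly 4.7 at a given resolving power.
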

[0133] Table 5 shows the approximate size of an oligonucleotide in which each 
possible single point mutation could be detected by mass spectrometers of different 
resolving power without any molecular weight modification. 
EXAMPLES 

1. Cleavage of Base-modified Polynucleotide Using Secondary Amines 
[0134] As noted previously, secondary amines having high boiling points at 
atmospheric pressure (>100° C, preferably >150° C, most preferably >200° C) are 
presently preferred chemical bases for cleaving base-modified polynucleotides. These 
amines have several advantages. Their high boiling points result in less, if any, of the 
amine being lost due to volatilization during the cleavage, which provides improved 
control of stoichiometry during cleavage and of purification prior to MALDI-TOF mass 
spectrometric analysis. In addition, the high boiling amines are substantially less 
odiferous at the temperatures required for cleavage, i.e., 80° C or higher, preferably, at 
present, greater than 90° C and most preferably at present between 90° and 100° C. 
Presently preferred high boiling secondary amines include 3-pyrrolidinol, 
2-pyrrolidinemethanol, 3-pyrrolidinemethanol and 4-piperidineethanol. 

[0135] Fig. 2 shows the results of cleavage of the oligonucleotides shown in Fig. 
1 after the four natural nucleotides, A, G, C, and T, have been individually replaced by a 
corresponding base-modified nucleotide, that is, 7-nitro-7-deaza-dA for A, 
7-nitro-7-deaza-dG for G, 5-hydroxy-dC for C and 5-hydroxy-dU for T. In Fig. 2, the underlined 
nucleotides are the primers used to amplify the polynucleotides. The primers are 
comprised entirely of natural nucleotides that are not modified and do not participate in 
any way in the cleavage reaction. Prior to cleavage, the modified polynucleotides were 
oxidized with potassium permanganate (approximately 2 mM KMnO4 for 5 minutes at 
room temperature). Cleavage was then accomplished by treating the oxidation 
products with the indicated secondary amines at 95° C for one hour. As can be seen in 
Fig. 2, this resulted in incomplete cleavage. The best result was obtained with 
3-pyrrolidinol. It is thought that this might be due to less steric hindrance than with 
2-pyrrolidinemethanol and higher nucleophilicity and basicity than the piperidine 
compound. 

[0136] Since the three secondary amines have such high boiling points and, in 
addition, are relatively water soluble due in part to the hydroxyl groups, use of a higher 
concentration and higher temperature was considered. Thus, when the same four 
base-modified polynucleotides were subjected, after oxidation with KMnO4, to 1.46 M 
3-pyrrolidinol for one hour at 98° C, base-modified nucleotide specific, complete cleavage 
was obtained (Fig. 3). Without being bound to a particular theory, it is suspected that 
this may be due to less steric hindrance than the 2-pyrrolidinemethanol and better 
nucleophilicity and basicity than 4-piperidineethanol. 

[0137] Since the electron-withdrawing effect of hydroxyl groups is generally 
detrimental to the nucleophilicity and basicity of a secondary amine, it was postulated 
that positioning this group further from the amino group than in 3-pyrrolidinol might 
provide an even better cleavage reagent. To avoid potential steric problems, 
3-pyrrolidinemethanol was selected as the compound of choice. 3-Pyrrolidinemethanol 
was synthesized according to the procedure of Goulet et al. ("SRC kinase inhibitor 
compounds", PCT/US00/17510; WO 01/00207 A1), shown in Fig. 4. When the 
7-nitro-7-deaza-dA-containing modified polynucleotide was subjected to cleavage using 1.04 M 
and 1.4 M 3-pyrrolidinemethanol versus 1.1 M and 1.46 M pyrrolidinol at 98° C for one 
hour, the former provided better cleavage results than the latter, even at lower 
concentrations (Fig. 5). The ability to use a lower concentration of amine may have 
advantages in subsequent sample preparation prior to mass spectrometry. 
2. Genotyping 

[0138] Figure 6 is a schematic representation of genotyping using the method of
this invention. One of the primers (Primer 1) is designed to be close to the polymorphic
site so that one of the polymorphic bases (e.g., A), when replaced with a modified
nucleotide, will be the first cleavage site. PCR amplification with the modified nucleotide
and the three natural nucleotides provides the two alleles, of which only one would be
cleavable at the polymorphic site. Treatment of the cleavable allele with chemical
reagents gives a fragment that contains Primer 1. The length of this fragment will
reveal the genotype of the sample. Analysis of the fragment can be carried out by,
without limitation, mass spectrometry or electrophoresis. Mass spectrometry analysis
might also reveal the single base difference on the complementary strand of DNA that
contains the polymorphism, providing built-in redundancy and higher accuracy.
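
The fragment-length logic of this scheme can be sketched in a few lines of Python. This is an illustration only, under the assumption that the strand is cut at the first modified nucleotide downstream of Primer 1 and that the retained fragment consists of the primer plus the natural bases preceding that nucleotide. The function name and the sequences are hypothetical and are not taken from any gene described herein.

```python
def primer_fragment_length(primer: str, extension: str, modified_base: str = "A") -> int:
    """Length of the Primer 1-containing fragment after chemical cleavage.

    Assumptions (illustrative only): the primer is built from natural
    nucleotides and is never cleaved; the strand is cut at the first
    modified base in the extension, and that base is not counted in the
    retained fragment.
    """
    idx = extension.find(modified_base)
    if idx == -1:
        # No cleavable base downstream: the whole strand is retained.
        return len(primer) + len(extension)
    return len(primer) + idx


# Hypothetical 10-nt primer and two alleles differing at a single position.
primer = "GCTTGCCTGC"
allele_a = "TCACGATCGT"  # polymorphic base (index 2) is A, the first cleavage site
allele_g = "TCGCGATCGT"  # polymorphic base is G; the first A is at index 5

len_a = primer_fragment_length(primer, allele_a)
len_g = primer_fragment_length(primer, allele_g)
```

In this toy case the two alleles yield primer-containing fragments of different lengths, which is the property the genotyping method relies on.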
[0139] Illustrated in Figures 7 to 9 are the chemical cleavage and analysis
procedures utilized to genotype the transferrin receptor (TR) gene. An 82bp DNA
sequence of the TR gene was selected based on the location of the polymorphism and
efficiency of amplification (Figure 7(a)). The polymorphic base (A or G) was positioned
3 bases from the 3' end of Primer 1. For the A allele it is the first modified nucleotide to
be incorporated. For the G allele, the first cleavable nucleotide is 6 bases from the
primer. As a result, cleavage will produce fragments of different lengths. The PCR
amplification reactions (50 µl each) were carried out in standard buffer with polymerase
AmpliTaq Gold (0.1 unit/µl) on a Thermocycler (MJ Research PTC-200) using 35 cycles
of amplification (1 min denaturation, 1.5 min annealing, and 5 min extension). Analysis
of the PCR products on a 5% non-denaturing polyacrylamide gel (stained with
Stains-All from Sigma) showed that 7-deaza-7-nitro-dATP can replace dATP and still
result in an efficient PCR amplification.

[0140] To the PCR products were added piperidine, tris-(2-carboxyethyl)phosphine
(TCEP), and Tris base at final concentrations of 1 M, 0.2 M, and 0.5 M,
respectively, in a total volume of 100 µl. After incubation at 95° C for 1 hour, 1 ml of
0.2 M triethylammonium acetate (TEAA) was added to each reaction mixture and the
resulting solution purified on an OASIS column (Waters). The eluted products were
concentrated to dryness on a Speedvac and the residue analyzed by mass
spectrometry or electrophoresis. Figure 7(b) shows the sequences expected from
cleavage at 7-deaza-7-nitro-dA. The sequences are grouped according to lengths and
molecular weights. The first group contains longer fragments that contain the primer
sequence. The 22nt fragment is invariant and may be used as an internal reference.
The 25nt or 28nt fragment is expected from the A or G allele, respectively. The shaded
group of sequences is from the complementary strand of DNA, including invariant
13nt and 11nt fragments that can be used as internal references and a pair of 11nt
fragments expected from the two allelic forms of the TR gene with a 15 Da mass difference.

[0141] Shown in Figure 8 is a MALDI-TOF spectrum of chemically cleaved
products from an 82bp heterozygote TR DNA sample. Highlighted in the spectrum are
the two regions that contain fragments depicted in Figure 7(b).
[0142] Each purified cleavage sample was mixed with 3-hydroxypicolinic acid
and subjected to MALDI-TOF analysis on a PerSeptive Biosystems Voyager-DE mass
spectrometer. Mass spectra in the region of 7000-9200 Daltons were recorded and the
results for the three TR genotypes are shown in Figure 9. The spectra were aligned
using the peak representing the invariant 22nt fragment (7189 Da). Two additional
peaks were observed for the AG heterozygote sample with one corresponding to the A
allele (8057 Da) and the other to the G allele (9005 Da). As expected, only one
additional peak was observed for the GG or AA homozygote samples, each with the
molecular weight of cleavage fragments from the G or A allele.
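
The peak-based genotype assignment described above can be expressed as a simple lookup. The following Python sketch is illustrative only: the three expected masses are those reported above, whereas the function names, the 5 Da matching tolerance, and the rule of declining to call a sample that lacks the reference peak are assumptions introduced for the example.

```python
# Expected masses (Da) reported above for the primer-containing fragments.
REFERENCE_DA = 7189.0  # invariant 22nt fragment (internal reference)
A_ALLELE_DA = 8057.0   # 25nt fragment from the A allele
G_ALLELE_DA = 9005.0   # 28nt fragment from the G allele


def has_peak(peaks, expected, tol=5.0):
    """True if any observed peak lies within tol Da of the expected mass.

    The 5 Da default tolerance is an assumed value, not one from the text.
    """
    return any(abs(p - expected) <= tol for p in peaks)


def call_genotype(peaks, tol=5.0):
    """Return 'AA', 'AG' or 'GG', or None when no call can be made."""
    if not has_peak(peaks, REFERENCE_DA, tol):
        return None  # internal reference missing: decline to call (assumed rule)
    a_found = has_peak(peaks, A_ALLELE_DA, tol)
    g_found = has_peak(peaks, G_ALLELE_DA, tol)
    if a_found and g_found:
        return "AG"
    if a_found:
        return "AA"
    if g_found:
        return "GG"
    return None
```

For example, a peak list containing masses near 7189, 8057 and 9005 Da would be called as the AG heterozygote, while a list lacking the 7189 Da reference peak would not be called at all.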
[0143] Figure 10(a) shows a mass spectrum of the AG heterozygote sample in
the region of 3700-4600 Da. With 3807 Da and 4441 Da fragments as internal
references, the genotype of this sample was confirmed through the observation of two
peaks in the middle of the spectrum with a 15 Da mass difference. The molecular
weights observed by mass spectrometry indicated that phosphate-deoxyribose-TCEP
adducts were formed during the cleavage reaction, resulting in fragments that are
modified at their 3' ends (Figure 10(b)). The data shown in Figures 8 and 10 also
illustrate that the combination of chemical cleavage and mass spectrometry can provide
corroborating genotyping information from both strands of DNA, thereby assuring the
accuracy of the analysis.
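
The 15 Da spacing is consistent with a single C/T difference on the complementary strand, that is, the complement of the G/A polymorphism. The short Python check below illustrates the arithmetic. The residue masses are standard average values for 2'-deoxynucleotide residues within a DNA chain and are not taken from the present data. Any 3' adduct common to both fragments cancels in the difference.

```python
# Average masses (Da) of 2'-deoxynucleoside monophosphate residues within a
# DNA chain. Standard reference values, not measurements from this work.
RESIDUE_MASS_DA = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}


def substitution_mass_shift(base_from: str, base_to: str) -> float:
    """Mass change (Da) when one natural residue replaces another."""
    return RESIDUE_MASS_DA[base_to] - RESIDUE_MASS_DA[base_from]


# Complementary strand: C pairs with the G allele, T pairs with the A allele.
shift_c_to_t = substitution_mass_shift("C", "T")
# For comparison, the same substitution on a strand of natural nucleotides.
shift_g_to_a = substitution_mass_shift("G", "A")
```

The C-to-T shift evaluates to approximately 15 Da, matching the spacing of the two central peaks in Figure 10(a).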

[0144] Alternatively, the chemically cleaved samples may be analyzed by
electrophoresis. Capillary electrophoresis (CE) analyses were performed using a
homemade instrument with a UV detector and a capillary containing denaturing linear
polyacrylamide gel. Figure 11 shows the CE chromatogram obtained from TR samples
of various genotypes. As predicted, each genotype gave a different elution pattern
corresponding to the lengths of the cleavage products. Whereas the AA homozygote
produced a 25nt fragment and the GG homozygote generated a 28nt fragment, the AG
heterozygote sample afforded both 25nt and 28nt products. After being labeled at the 5'
end with ³²P, the cleavage samples were subjected to PAGE analysis. The resulting
autoradiogram (Figure 12) demonstrates that the cleavage is specific with little or no
background and the genotyping results are unambiguous.

[0145] Another useful detection method for this analysis would be FRET. In fact,
FRET has been successfully applied for polymorphism detection using TaqMan®
assays (J. A. Todd, et al., 1995, Nature Genetics, 3:341-342) and Molecular Beacons
(S. Tyagi, et al., 1998, Nature Biotechnology, 16:49-53). However, when longer probes
are necessary to achieve hybridization to target sequences (e.g., AT rich sequences), it
becomes increasingly difficult to distinguish the small difference resulting from a single
nucleotide mismatch. The advantage of chemical cleavage in this regard is illustrated
in Figure 13. Similar to the aforementioned example, a modified nucleotide analog of
one of the polymorphic bases (e.g., A) is substituted for its natural counterpart in a PCR
amplification. Primer 1 is designed to be close to the polymorphic site so that the
modified A would be the first cleavable nucleotide in the A allele. Primer 1 is also
labeled with a fluorescent group (F1) positioned close to its 3' end (Figure 13(a)). After
amplification and chemical cleavage, a probe covalently attached to another
fluorophore F2 (Figure 13(b)) can be added and the FRET between the two
fluorophores measured. Because one of the alleles was cleaved closer to the 3' end of
Primer 1 than the other, the difference in their hybridization is expected to be greater
than a single nucleotide mismatch, which may be exploited to distinguish the two
alleles. As shown in Figure 13(c), the temperature can be adjusted so that only the
longer fragment obtained from the G allele will hybridize with the probe, resulting in
FRET. Since in this system a "NO FRET" result could be interpreted either as allele A
or failed PCR amplification, it is necessary to measure the fluorescence of each sample
at various temperatures to ensure positive detection of the shorter fragment from allele
A at a lower temperature.
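
The two-temperature readout described above amounts to a small decision table, sketched here in Python. This is an illustration only: it assumes that at the lower temperature both the shorter (A allele) and longer (G allele) fragments hybridize to the probe, while at the higher temperature only the longer fragment does. The function name and the returned labels are hypothetical.

```python
def interpret_fret(fret_at_high_temp: bool, fret_at_low_temp: bool) -> str:
    """Interpret FRET readings taken at two temperatures.

    Assumed behavior: the higher temperature permits hybridization of the
    longer G-allele fragment only; the lower temperature permits
    hybridization of both fragments.
    """
    if fret_at_high_temp:
        return "G allele present"
    if fret_at_low_temp:
        return "A allele only"
    return "no product (failed amplification)"
```

Under these assumptions a presence/absence readout separates the AA homozygote from a failed amplification, but does not by itself separate the AG heterozygote from the GG homozygote; doing so would require additional information, such as signal intensities, that is beyond this sketch.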

CONCLUSION 

[0146] Thus, it will be appreciated that the method of the present invention 
provides versatile tools for the detection of polymorphisms among related 
polynucleotides. 

[0147] Although certain embodiments and examples have been used to describe
the present invention, it will be apparent to those skilled in the art that changes in the
embodiments and examples shown may be made without departing from the scope of
this invention.

[0148] Other embodiments are contained within the following claims.